Introduction to 
Linear Algebra 
with Applications 




Jim DeFranza 


St. Lawrence University 


Daniel Gagliardi 


SUNY Canton 


WAVELAND PRESS, INC.


Long Grove, Illinois 


For information about this book, contact: 
Waveland Press, Inc. 
4180 IL Route 83, Suite 101 
Long Grove, IL 60047-9580 
(847) 634-0081 
info@waveland.com 
www.waveland.com 


Copyright © 2009 by Jim DeFranza and Daniel Gagliardi 
Reissued 2015 by Waveland Press, Inc. 


10-digit ISBN 1-4786-2777-8 
13-digit ISBN 978-1-4786-2777-7 


All rights reserved. No part of this book may be reproduced, stored in a retrieval 
system, or transmitted in any form or by any means without permission in writing from 
the publisher. 


Printed in the United States of America 


7 6 5 4 3 2 1 


To Regan, Sara, and David 
—JD 


To Robin, Zachary, Michael, and Eric 
—DG 


About the Authors


Jim DeFranza was born in 1950 in Yonkers, New York, and grew up in Dobbs
Ferry, New York, on the Hudson River. Jim DeFranza is Professor of Mathematics
at St. Lawrence University in Canton, New York, where he has taught undergraduate
mathematics for 25 years. St. Lawrence University is a small Liberal Arts College
in upstate New York that prides itself on the close interaction that exists between
students and faculty. It is these many years of working closely with students that have
shaped this text in Linear Algebra and the other texts he has written. He received his
Ph.D. in Pure Mathematics from Kent State University in 1979. Dr. DeFranza has
coauthored PRECALCULUS, Fourth Edition and two other texts in single variable
and multivariable calculus. Dr. DeFranza has also published a dozen research articles
in the areas of Sequence Spaces and Classical Summability Theory. Jim is married
and has two children, David and Sara. Jim and his wife Regan live outside of Canton,
New York, in a 150-year-old farmhouse.


Daniel Gagliardi is an Assistant Professor of Mathematics at SUNY Canton, in
Canton, New York. Dr. Gagliardi began his career as a software engineer at IBM
in East Fishkill, New York, writing programs to support semiconductor development
and manufacturing. He received his Ph.D. in Pure Mathematics from North Carolina
State University in 2003 under the supervision of Aloysius Helminck. Dr. Gagliardi’s
principal area of research is in Symmetric Spaces. In particular, his current work
is concerned with developing algorithmic formulations to describe the fine structure
(characters and Weyl groups) of local symmetric spaces. Dr. Gagliardi also does
research in Graph Theory. His focus there is on the graphical realization of certain
types of sequences. In addition to his work as a mathematician, Dr. Gagliardi is an
accomplished double bassist and has recently recorded a CD of jazz standards with
Author/Pianist Bill Vitek. Dr. Gagliardi lives in northern New York in the picturesque
Saint Lawrence River Valley with his wife Robin, and children Zachary, Michael,
and Eric.


Contents


Preface xi


CHAPTER 1 Systems of Linear Equations and Matrices 1


1.1 Systems of Linear Equations 2 
Exercise Set 1.1 12 


1.2 Matrices and Elementary Row Operations 14 
Exercise Set 1.2 23 


1.3 Matrix Algebra 26 
Exercise Set 1.3 37 


1.4 The Inverse of a Square Matrix 39 
Exercise Set 1.4 45 


1.5 Matrix Equations 48 
Exercise Set 1.5 51 


1.6 Determinants 54 
Exercise Set 1.6 65 


1.7 Elementary Matrices and LU Factorization 68 
Exercise Set 1.7 77 


1.8 Applications of Systems of Linear Equations 79 
Exercise Set 1.8 84 


Review Exercises 89 
Chapter Test 90 


CHAPTER 2 Linear Combinations and Linear Independence 93


2.1 Vectors in $\mathbb{R}^n$ 94
Exercise Set 2.1 99


2.2 Linear Combinations 101
Exercise Set 2.2 108


2.3 Linear Independence 111
Exercise Set 2.3 120


Review Exercises 123
Chapter Test 125


CHAPTER 3 Vector Spaces 127


3.1 Definition of a Vector Space 129 
Exercise Set 3.1 137 


3.2 Subspaces 140 
Exercise Set 3.2 154 


3.3 Basis and Dimension 156 
Exercise Set 3.3 171 


3.4 Coordinates and Change of Basis 173 
Exercise Set 3.4 182 


3.5 Application: Differential Equations 185 
Exercise Set 3.5 193 


Review Exercises 194 
Chapter Test 195 


CHAPTER 4 Linear Transformations 199


4.1 Linear Transformations 200 
Exercise Set 4.1 211 


4.2 The Null Space and Range 214 
Exercise Set 4.2 223 


4.3 Isomorphisms 226 
Exercise Set 4.3 233 


4.4 Matrix Representation of a Linear Transformation 235 
Exercise Set 4.4 245 


4.5 Similarity 249 
Exercise Set 4.5 253 


4.6 Application: Computer Graphics 255 
Exercise Set 4.6 268 


Review Exercises 270 
Chapter Test 272 


CHAPTER 5 Eigenvalues and Eigenvectors 275


5.1 Eigenvalues and Eigenvectors 276 
Exercise Set 5.1 285 

5.2 Diagonalization 287 
Exercise Set 5.2 298 

5.3 Application: Systems of Linear Differential Equations 300 
Exercise Set 5.3 309 


5.4 Application: Markov Chains 310
Exercise Set 5.4 315


Review Exercises 316
Chapter Test 318


CHAPTER 6 Inner Product Spaces 321


6.1 The Dot Product on $\mathbb{R}^n$ 323
Exercise Set 6.1 331


6.2 Inner Product Spaces 333
Exercise Set 6.2 341


6.3 Orthonormal Bases 342
Exercise Set 6.3 352


6.4 Orthogonal Complements 355
Exercise Set 6.4 364


6.5 Application: Least Squares Approximation 366
Exercise Set 6.5 375


6.6 Diagonalization of Symmetric Matrices 377
Exercise Set 6.6 383


6.7 Application: Quadratic Forms 385
Exercise Set 6.7 392


6.8 Application: Singular Value Decomposition 392
Exercise Set 6.8 403


Review Exercises 404 
Chapter Test 406 


Appendix 409 
Answers to Odd-Numbered Exercises 440 
Index 479 


Preface 


Introduction to Linear Algebra with Applications is an introductory text targeted to 
second-year or advanced first-year undergraduate students. The organization of this 
text is motivated by what our experience tells us are the essential concepts that students 
should master in a one-semester undergraduate linear algebra course. The centerpiece 
of our philosophy regarding the presentation of the material is that each topic should 
be fully developed before the reader moves on to the next. In addition, there should be
a natural connection between topics. We take great care to meet both of these objectives.
This allows us to stay on task so that each topic can be covered with the depth
required before progression to the next logical one. As a result, the reader is prepared 
for each new unit, and there is no need to repeat a concept in a subsequent chapter 
when it is utilized. 

Linear algebra is taken early in an undergraduate curriculum and yet offers the 
opportunity to introduce the importance of abstraction, not only in mathematics, but in 
many other areas where linear algebra is used. Our approach is to take advantage of this 
opportunity by presenting abstract vector spaces as early as possible. Throughout the 
text, we are mindful of the difficulties that students at this level have with abstraction 
and introduce new concepts first through examples which gently illustrate the idea. 
To motivate the definition of an abstract vector space, and the subtle concept of 
linear independence, we use addition and scalar multiplication of vectors in Euclidean 
space. We have strived to create a balance among computation, problem solving, and 
abstraction. This approach equips students with the necessary skills and problem- 
solving strategies in an abstract setting that allows for a greater understanding and 
appreciation for the numerous applications of the subject. 


Pedagogical Features 


1. Linear systems, matrix algebra, and determinants: We have given a stream- 
lined, but complete, discussion of solving linear systems, matrix algebra, determi- 
nants, and their connection in Chap. 1. Computational techniques are introduced, 
and a number of theorems are proved. In this way, students can hone their 
problem-solving skills while beginning to develop a conceptual sense of the fun- 
damental ideas of linear algebra. Determinants are no longer central in linear 
algebra, and we believe that in a course at this level, only a few lectures should 
be devoted to the topic. For this reason we have presented all the essentials on 
determinants, including their connection to linear systems and matrix inverses, 
in Chap. 1. This choice also enables us to use determinants as a theoretical tool 
throughout the text whenever the need arises. 




2. Vectors: Vectors are introduced in Chap. 1, providing students with a familiar
structure to work with as they start to explore the properties which are used later 
to characterize abstract vector spaces. 


3. Linear independence: We have found that many students have difficulties with
linear combinations and the concept of linear independence. These ideas are fun- 
damental to linear algebra and are essential to almost every topic after linear 
systems. When students fail to grasp them, the full benefits of the course cannot 
be realized. In Introduction to Linear Algebra with Applications we have devoted 
Chap. 2 to a careful exposition of linear combinations and linear independence 
in the context of Euclidean space. This serves several purposes. First, by placing 
these concepts in a separate chapter their importance in linear algebra is high- 
lighted. Second, an instructor using the text can give exclusive focus to these ideas 
before applying them to other problems and situations. Third, many of the impor- 
tant ramifications of linear combinations and linear independence are considered 
in the familiar territory of Euclidean spaces. 


4. Euclidean spaces $\mathbb{R}^n$: The Euclidean spaces and their algebraic properties are
introduced in Chap. 2 and are used as a model for the abstract vector spaces of
Chap. 3. We have found that this approach works well for students with limited 
exposure to abstraction at this level. 


5. Geometric representations: Whenever possible, we include figures with geometric
representations and interpretations to illuminate the ideas being presented.


6. New concepts: New concepts are almost always introduced first through concrete
examples. Formal definitions and theorems are then given to describe the situation 
in general. Additional examples are also provided to further develop the new idea 
and to explore it in greater depth. 


7. True/false chapter tests: Each chapter ends with a true/false Chapter Test with
approximately 40 questions. These questions are designed to help the student 
connect concepts and better understand the facts presented in the chapter. 


8. Rigor and intuition: The approach we have taken attempts to strike a balance
between presenting a rigorous development of linear algebra and building intu- 
ition. For example, we have chosen to omit the proofs for theorems that are not 
especially enlightening or that contain excessive computations. When a proof is 
not present, we include a motivating discussion describing the importance and 
use of the result and, if possible, the idea behind a proof. 

9. Abstract vector spaces: We have positioned abstract vector spaces as a central
topic within Introduction to Linear Algebra with Applications by placing their 
introduction as early as possible in Chap. 3. We do this to ensure that abstract 
vector spaces receive the appropriate emphasis. In a typical undergraduate math- 
ematics curriculum, a course on linear algebra is the first time that students are 
exposed to this level of abstraction. However, Euclidean spaces still play a central 
role in our approach because of their familiarity and since they are so widely 
used. At the end of this chapter, we include a section on differential equations 
which underscores the need for the abstract theory of vector spaces. 




10. Section fact summaries: Each section ends with a summary of the important facts 
and techniques established in the section. They are written, whenever possible, 
using nontechnical language and mostly without notation. These summaries are 
not meant to give a recapitulation of the details and formulas of the section; 
rather they are designed to give an overview of the main ideas of the section. 
Our intention is to help students to make connections between the concepts of 
the section as they survey the topic from a greater vantage point. 


Applications 


Over the last few decades the applications of linear algebra have mushroomed, increas- 
ing not only in their numbers, but also in the diversity of fields to which they apply. 
Much of this growth is fueled by the power of modern computers and the availability 
of computer algebra systems used to carry out computations for problems involving 
large matrices. This impressive power has made linear algebra more relevant than 
ever. Recently, a consortium of mathematics educators has placed its importance, rel- 
ative to applications, second only to calculus. Increasingly, universities are offering 
courses in linear algebra that are specifically geared toward its applications. Whether 
the intended audience is engineering, economics, science, or mathematics students, 
the abstract theory is essential to understanding how linear algebra is applied. 

In this text our introduction to the applications of linear algebra begins in Sec. 1.8 
where we show how linear systems can be used to solve problems related to chemistry, 
engineering, economics, nutrition, and urban planning. However, many types of appli- 
cations involve the more sophisticated concepts we develop in the text. These appli- 
cations require the theoretical notions beyond the basic ideas of Chap. 1, and are 
presented at the end of a chapter as soon as the required background material is com- 
pleted. Naturally, we have had to limit the number of applications considered. It is our 
hope that the topics we have chosen will interest the reader and lead to further inquiry. 

Specifically, in Sec. 4.6, we discuss the role of linear algebra in computer graph- 
ics. An introduction to the connection between differential equations and linear algebra 
is given in Secs. 3.5 and 5.3. Markov chains and quadratic forms are examined in 
Secs. 5.4 and 6.7, respectively. Section 6.5 focuses on the problem of finding approx- 
imate solutions to inconsistent linear systems. One of the most familiar applications 
here is the problem of finding the equation of a line that best fits a set of data points. 
Finally, in Sec. 6.8 we consider the singular value decomposition of a matrix and its 
application to data compression. 


Technology 


Computations are an integral part of any introductory course in mathematics and 
certainly in linear algebra. To gain mastery of the techniques, we encourage the student 
to solve as many problems as possible by hand. That said, we also encourage the 
student to make appropriate use of the available technologies designed to facilitate, 
or to completely carry out, some of the more tedious computations. For example, it 
is quite reasonable to use a computer algebra system, such as MAPLE or MATLAB, 




to row-reduce a large matrix. Our approach in Introduction to Linear Algebra with 
Applications is to assume that some form of technology will be used, but leave the 
choice to the individual instructor and student. We do not think that it is necessary to 
include discussions or exercises that use particular software. Note that this text can be 
used with or without technology. The degree to which it is used is left to the discretion 
of the instructor. From our own experience, we have found that Scientific Notebook™,
which offers a front end for LaTeX along with menu access to the computer algebra
system MuPAD, allows the student to gain experience using technology to carry out
computations while learning to write clear mathematics. Another option is to use LaTeX
for writing mathematics and a computer algebra system to perform computations. 
Another aspect of technology in linear algebra has to do with the accuracy and 
efficiency of computations. Some applications, such as those related to Internet search 
engines, involve very large matrices which require extensive processing. Moreover, the 
accuracy of the results can be affected by computer roundoff error. For example, using 
the characteristic equation to find the eigenvalues of a large matrix is not feasible. 
Overcoming problems of this kind is extremely important. The field of study known as 
numerical linear algebra is an area of vibrant research for both software engineers and 
applied mathematicians who are concerned with developing practical solutions. In our 
text, the fundamental concepts of linear algebra are introduced using simple examples. 
However, students should be made aware of the computational difficulties that arise 
when extending these ideas beyond the small matrices used in the illustrations. 


Other Features 


1. Chapter openers: The opening remarks for each chapter describe an application 
that is directly related to the material in the chapter. These provide additional 
motivation and emphasize the relevance of the material that is about to be covered. 


2. Writing style: The writing style is clear, engaging, and easy to follow. Impor- 
tant new concepts are first introduced with examples to help develop the reader’s 
intuition. We limit the use of jargon and provide explanations that are as reader- 
friendly as possible. Every explanation is crafted with the student in mind. Intro- 
duction to Linear Algebra with Applications is specifically designed to be a 
readable text from which a student can learn the fundamental concepts in linear 
algebra. 


3. Exercise sets: Exercise sets are organized with routine exercises at the beginning 
and the more difficult problems toward the end. There is a mix of computational 
and theoretical exercises with some requiring proof. The early portion of each 
exercise set tests the student’s ability to apply the basic concepts. These exercises 
are primarily computational, and their solutions follow from the worked examples 
in the section. The latter portion of each exercise set extends the concepts and 
techniques by asking the student to construct complete arguments. 

4. Review exercise sets: The review exercise sets are organized as sample exams 
with 10 exercises. These exercises tend to have multiple parts, which connect 
the various techniques and concepts presented in the text. At least one problem 
in each of these sets presents a new idea in the context of the material of the 
chapter. 




5. Length: The length of the text reflects the fact that it is specifically designed for 
a one-semester course in linear algebra at the undergraduate level. 


6. Appendix: The appendix contains background material on the algebra of sets, 
functions, techniques of proof, and mathematical induction. With this feature, the 
instructor is able to cover, as needed, topics that are typically included in a Bridge 
Course to higher mathematics. 


Course Outline 


The topics we have chosen for Introduction to Linear Algebra with Applications 
closely follow those commonly covered in a first introductory course. The order in 
which we present these topics reflects our approach and preferences for emphasis. 
Nevertheless, we have written the text to be flexible, allowing for some permutations 
of the order of topics without any loss of consistency. In Chap. 1 we present all the 
basic material on linear systems, matrix algebra, determinants, elementary matrices, 
and the LU decomposition. Chap. 2 is entirely devoted to a careful exposition of linear
combinations and linear independence in $\mathbb{R}^n$. We have found that many students
have difficulty with these essential concepts. The addition of this chapter gives us 
the opportunity to develop all the important ideas in a familiar setting. As mentioned 
earlier, to emphasize the importance of abstract vector spaces, we have positioned 
their introduction as early as possible in Chap. 3. Also, in Chap. 3 is a discussion 
of subspaces, bases, and coordinates. Linear transformations between vector spaces 
are the subject of Chap. 4. We give descriptions of the null space and range of a 
linear transformation at the beginning of the chapter, and later we show that every 
finite dimensional vector space, of dimension n, is isomorphic to $\mathbb{R}^n$. Also, in Chap. 4
we introduce the four fundamental subspaces of a matrix and discuss the action of an
m x n matrix on a vector in $\mathbb{R}^n$. Chap. 5 is concerned with eigenvalues and eigenvectors.
An abundance of examples are given to illustrate the techniques of computing
eigenvalues and finding the corresponding eigenvectors. We discuss the algebraic and 
geometric multiplicities of eigenvalues and give criteria for when a square matrix is 
diagonalizable. In Chap. 6, using $\mathbb{R}^n$ as a model, we show how a geometry can be
defined on a vector space by means of an inner product. We also give a description 
of the Gram-Schmidt process used to find an orthonormal basis for an inner product 
space and present material on orthogonal complements. At the end of this chapter we 
discuss the singular value decomposition of an m x n matrix. The Appendix contains 
a brief summary of some topics found in a Bridge Course to higher mathematics. 
Here we include material on the algebra of sets, functions, techniques of proof, and 
mathematical induction. Application sections are placed at the end of chapters as soon 
as the requisite background material has been covered. 


Acknowledgments 


We would like to give our heartfelt thanks to the many individuals who reviewed 
the manuscript at various stages of its development. Their thoughtful comments and 
excellent suggestions have helped us enormously with our efforts to realize our vision 
of a reader-friendly introductory text on linear algebra. 




We would also like to give special thanks to David Meel of Bowling Green 
State University, Bowling Green, Ohio, for his thorough review of the manuscript and 
insightful comments that have improved the exposition of the material in the text. 
We are also grateful to Ernie Stitzinger of North Carolina State University who had 
the tiring task of checking the complete manuscript for accuracy, including all the 
exercises. A very special thanks goes to our editors (and facilitators), Liz Covello 
(Sr. Sponsoring Editor), Michelle Driscoll (Developmental Editor), and Joyce Watters 
(Project Manager) who have helped us in more ways than we can name, from the 
inception of this project to its completion. On a personal level, we would like to 
thank our wives, Regan DeFranza and Robin Gagliardi, for their love and support; 
and our students at Saint Lawrence University and SUNY Canton who provided the 
motivation to write the text. 




List of Reviewers 


Marie Aratari, Oakland Community College 

Cik Azizah, Universiti Utara Malaysia (UUM) 

Przemyslaw Bogacki, Old Dominion University 

Rita Chattopadhyay, Eastern Michigan University 

Eugene Don, Queens College 

Lou Giannini, Curtin University of Technology 

Gregory Gibson, North Carolina A&T University 

Mark Gockenback, Michigan Technological University 

Dr. Leong Wah June, Universiti Putra Malaysia 

Cerry Klein, University of Missouri—Columbia 

Kevin Knudson, Mississippi State University 

Hyungiun Ko, Yonsei University 

Jacob Kogan, University of Maryland—Baltimore County 
David Meel, Bowling Green State University 

Martin Nakashima, California State Poly University—Pomona 
Eugene Spiegel, University of Connecticut—Storrs 

Dr. Hajar Sulaiman, Universiti Sains Malaysia (USM) 

Gnana Bhaskar Tenali, Florida Institute of Technology—Melbourne 




To the Student


You are probably taking this course early in your undergraduate studies after two or 
three semesters of calculus, and most likely in your second year. Like calculus, linear 
algebra is a subject with elegant theory and many diverse applications. However, in 
this course you will be exposed to abstraction at a much higher level. To help with 
this transition, some colleges and universities offer a Bridge Course to Higher 
Mathematics. If you have not already taken such a course, this will likely be the first
mathematics course where you will be expected to read and understand proofs of 
theorems, provide proofs of results as part of the exercise sets, and apply the concepts 
presented. All this is in the context of a specific body of knowledge. If you approach 
this task with an open mind and a willingness to read the text, some parts perhaps more 
than once, it will be an exciting and rewarding experience. Whether you are taking this 
course as part of a mathematics major or because linear algebra is applied in your 
specific area of study, a clear understanding of the theory is essential for applying the 
concepts of linear algebra to mathematics or other fields of science. The solved 
examples and exercises in the text are designed to prepare you for the types of problems 
you can expect to see in this course and other more advanced courses in mathematics. 
The organization of the material is based on our philosophy that each topic should be
fully developed before readers move on to the next. This learning strategy is particularly
applicable to the study of mathematics. The material contained in Chaps. 1 through 4
forms the basis for everything that comes afterward. All other more advanced topics 
and applications extend from the foundational material or from the ancillary material of 
the intervening chapters. We have specifically designed our text so that you can read it 
and learn the concepts of linear algebra in a sequential and thorough manner. If you 
remain committed to learning this beautiful subject, the rewards will be significant in 
other courses you may take, and in your professional career. Good luck! 


Jim DeFranza 
jdefranza@stlawu.edu 


Dan Gagliardi 
gagliardid@canton.edu


Applications Index 


Aircraft Design, 199 

Astronomy, 61 

Average Global Temperatures, 371 
Balancing Chemical Equations, 1, 79, 84 
Carbon Emissions, 321 

Cellular Phone Subscribers, 86 
Computer Graphics, 199, 255 
Cumulative HIV Infections, 376 
Data Compression, 401 

Differential Equations, 185 

Digital Transmissions, 127, 402 
Economics, 82, 86, 87 

Electrical Networks, 88, 191 
Engineering, 185, 191 

Epidemics, 87, 315 

Exponential Growth and Decay, 186 
Fitting Data with a Curve, 11, 61, 366, 372, 376 
Fourier Approximation, 373 

Infant Mortality Rates, 376 

Least Squares Approximation, 321, 366 
Linear Regression, 371 

Markov Chains, 275, 310 

Mass Transit Systems, 315 

Mixing, 307, 310 

National Health Care, 86 

Networks, 79, 85, 89 

Nutrition, 81, 85 

Photosynthesis, 1 

Plant Breeding, 315

Population Distribution, 87, 275, 315 
Powers of a Matrix, 287 
Predator-Prey Models, 300
Projections, 265, 343, 345, 361, 365 
Quadratic Forms, 385 

Rotation of Axes, 385 

Search Engines, 276 




Signal Processing, 93 

Singular Value Decomposition, 392 
Systems of Differential Equations, 300 
Thermal Equilibrium, 88, 310 
Vibrating Spring, 192, 194 

World Hydroelectricity Use, 376 
World Population, 376 


CHAPTER 1


Systems of Linear
Equations and Matrices


CHAPTER OUTLINE
1.1 Systems of Linear Equations 2
1.2 Matrices and Elementary Row Operations 14
1.3 Matrix Algebra 26
1.4 The Inverse of a Square Matrix 39
1.5 Matrix Equations 48
1.6 Determinants 54
1.7 Elementary Matrices and LU Factorization 68
1.8 Applications of Systems of Linear Equations 79


In the process of photosynthesis, solar energy
is converted into forms that are used by living 
organisms. The chemical reaction that occurs in 
the leaves of plants converts carbon dioxide and 
water to carbohydrates with the release of oxygen. 
The chemical equation of the reaction takes the 
form 


$$a\,\mathrm{CO_2} + b\,\mathrm{H_2O} \longrightarrow c\,\mathrm{O_2} + d\,\mathrm{C_6H_{12}O_6}$$


where a,b,c, and d are some positive whole 
numbers. The law of conservation of mass states 
that the total mass of all substances present before 
and after a chemical reaction remains the same. 
That is, atoms are neither created nor destroyed 
in a chemical reaction, so chemical equations must be balanced. To balance the
photosynthesis reaction equation, the same number of carbon atoms must appear on both
sides of the equation, so

$$a = 6d$$

The same number of oxygen atoms must appear on both sides, so

$$2a + b = 2c + 6d$$

and the same number of hydrogen atoms must appear on both sides, so

$$2b = 12d$$


[Photograph by Jan Smith/RF]




This gives us the system of three linear equations in four variables 


$$
\begin{aligned}
a - 6d &= 0 \\
2a + b - 2c - 6d &= 0 \\
2b - 12d &= 0
\end{aligned}
$$


Any positive integers a,b,c, and d that satisfy all three equations are a solution to 
this system which balances the chemical equation. For example, a = 6, b = 6, c = 6, 
and d = 1 balances the equation. 
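
As a quick check of the balancing conditions, the short Python sketch below substitutes candidate coefficients into the three equations above. The function name `balances` is our own and is introduced only for illustration.

```python
def balances(a, b, c, d):
    """Return True if (a, b, c, d) satisfies the three atom-balance equations."""
    carbon = a - 6 * d == 0                    # a - 6d = 0
    oxygen = 2 * a + b - 2 * c - 6 * d == 0    # 2a + b - 2c - 6d = 0
    hydrogen = 2 * b - 12 * d == 0             # 2b - 12d = 0
    return carbon and oxygen and hydrogen


print(balances(6, 6, 6, 1))     # True: the solution given in the text
print(balances(12, 12, 12, 2))  # True: doubling every coefficient still balances
print(balances(6, 6, 5, 1))     # False: the oxygen equation fails
```

The second call suggests why the system has more than one solution: there are four variables but only three equations.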


Many diverse applications are modeled by systems of equations. Systems of 
equations are also important in mathematics and in particular in linear algebra. In 
this chapter we develop systematic methods for solving systems of linear equations. 


1.1 Systems of Linear Equations


As the introductory example illustrates, many naturally occurring processes are 
modeled using more than one equation and can require many equations in many vari- 
ables. For another example, models of the economy contain thousands of equations 
and thousands of variables. To develop this idea, consider the set of equations 


$$
\begin{aligned}
2x - y &= 2 \\
x + 2y &= 6
\end{aligned}
$$


which is a system of two equations in the common variables x and y. A solution to 
this system consists of values for x and y that simultaneously satisfy each equation. 
In this example we proceed by solving the first equation for y, so that 


$$y = 2x - 2$$

To find the solution, substitute $y = 2x - 2$ into the second equation to obtain
$x + 2(2x - 2) = 6$, and solving for x gives $x = 2$.


Substituting x = 2 back into the first equation yields $2(2) - y = 2$, so that y = 2.
Therefore the unique solution to the system is x = 2, y = 2. Since both of these
equations represent straight lines, a solution exists provided that the lines intersect.
These lines intersect at the unique point (2, 2), as shown in Fig. 1(a). A system of
equations is consistent if there is at least one solution to the system. If there are no 
solutions, the system is inconsistent. In the case of systems of two linear equations 
with two variables, there are three possibilities: 


1. The two lines have different slopes and hence intersect at a unique point, as shown 
in Fig. 1(a). 

2. The two lines are identical (one equation is a nonzero multiple of the other), so 
there are infinitely many solutions, as shown in Fig. 1(b). 


3. The two lines are parallel (have the same slope) and do not intersect, so the
system is inconsistent, as shown in Fig. 1(c).


[Figure 1: (a) two lines intersecting at the unique point (2, 2); (b) the identical lines x + y = 1 and 2x + 2y = 2; (c) the parallel lines -x + y = 1 and x - y = 1.]
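
The three possibilities can be told apart mechanically. The sketch below is one way to do so in Python. It relies on the quantity a1*b2 - a2*b1 and on Cramer's rule, shortcuts that are not introduced in this section and are used here only for convenience. The function name `classify` and the encoding of each line as the triple (a, b, c) for ax + by = c are our own assumptions.

```python
from fractions import Fraction


def classify(a1, b1, c1, a2, b2, c2):
    """Classify the system a1*x + b1*y = c1, a2*x + b2*y = c2.

    Assumes integer inputs and that each equation really is a line
    (a and b are not both zero).
    """
    det = a1 * b2 - a2 * b1
    if det != 0:
        # Different slopes: one intersection point (exact arithmetic).
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        return ("unique", (x, y))
    # Same slope: the lines coincide exactly when the constants are in
    # the same proportion as the coefficients.
    if a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1:
        return ("infinitely many", None)
    return ("inconsistent", None)


print(classify(2, -1, 2, 1, 2, 6)[0])   # unique: 2x - y = 2, x + 2y = 6
print(classify(1, 1, 1, 2, 2, 2)[0])    # infinitely many: x + y = 1, 2x + 2y = 2
print(classify(-1, 1, 1, 1, -1, 1)[0])  # inconsistent: -x + y = 1, x - y = 1
```

For the first system the returned point is (2, 2), in agreement with the substitution carried out above.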


When we are dealing with many variables, the standard method of representing 
linear equations is to affix subscripts to coefficients and variables. A linear equation 
in the n variables $x_1, x_2, \ldots, x_n$ is an equation of the form


$$a_1x_1 + a_2x_2 + \cdots + a_nx_n = b$$


To represent a system of m linear equations in n variables, two subscripts are used for 
each coefficient. The first subscript indicates the equation number while the second 
specifies the term of the equation. 


DEFINITION 1 System of Linear Equations A system of m linear equations in n variables,
or a linear system, is a collection of equations of the form


$$
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n &= b_3 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
$$


This is also referred to as an $m \times n$ linear system.


For example, the collection of equations 


−2x1 + 3x2 + x3 − x4 = −2
x1 + x3 − 4x4 = 1
3x1 — XxX. — X= 3 


is a linear system of three equations in four variables, or a 3 x 4 linear system. 

A solution to a linear system with n variables is an ordered sequence
(s1, s2, ..., sn) such that each equation is satisfied for x1 = s1, x2 = s2, ..., xn = sn.
The general solution or solution set is the set of all possible solutions. 


Chapter 1 Systems of Linear Equations and Matrices 




The Elimination Method 


The elimination method, also called Gaussian elimination, is an algorithm used to 
solve linear systems. To describe this algorithm, we first introduce the triangular form 
of a linear system. 

An m x n linear system is in triangular form provided that the coefficients
aij = 0 whenever i > j. In this case we refer to the linear system as a triangular
system. Two examples of triangular systems are

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= -1 \\
x_2 - 3x_3 &= 5 \\
x_3 &= 2
\end{aligned}
\qquad\text{and}\qquad
\begin{aligned}
x_1 + x_2 - x_3 - x_4 &= 2 \\
x_2 - x_3 - 2x_4 &= 1 \\
2x_3 - x_4 &= 3
\end{aligned}
\]


When a linear system is in triangular form, then the solution set can be obtained 
using a technique called back substitution. To illustrate this technique, consider the 
linear system given by 


\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= -1 \\
x_2 - 3x_3 &= 5 \\
x_3 &= 2
\end{aligned}
\]

From the last equation we see that x3 = 2. Substituting this into the second equation,
we obtain x2 − 3(2) = 5, so x2 = 11. Finally, using these values in the first equation,
we have x1 − 2(11) + 2 = −1, so x1 = 19. The solution is also written as (19, 11, 2).
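
Back substitution as just described can be sketched in a few lines of code. This is an illustrative sketch only; the name `back_substitute` is hypothetical, and it assumes a square triangular system whose diagonal coefficients are all nonzero, so that the solution is unique.

```python
from fractions import Fraction

def back_substitute(U, b):
    """Solve the triangular system U x = b, working from the last equation up.

    Assumes U is square with nonzero diagonal entries.
    """
    n = len(U)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        # Contribution of the variables that have already been found.
        known = sum(Fraction(U[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(b[i]) - known) / Fraction(U[i][i])
    return x

# The triangular system from the text.
U = [[1, -2, 1],
     [0, 1, -3],
     [0, 0, 1]]
b = [-1, 5, 2]
solution = back_substitute(U, b)
print(solution)
```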


DEFINITION 2 Equivalent Linear Systems Two linear systems are equivalent if they have
the same solutions.


For example, the system 


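\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= -1 \\
2x_1 - 3x_2 - x_3 &= 3 \\
x_1 - 2x_2 + 2x_3 &= 1
\end{aligned}
\]

has the unique solution x1 = 19, x2 = 11, and x3 = 2, so the linear systems

\[
\begin{aligned}
x_1 - 2x_2 + x_3 &= -1 \\
x_2 - 3x_3 &= 5 \\
x_3 &= 2
\end{aligned}
\qquad\text{and}\qquad
\begin{aligned}
x_1 - 2x_2 + x_3 &= -1 \\
2x_1 - 3x_2 - x_3 &= 3 \\
x_1 - 2x_2 + 2x_3 &= 1
\end{aligned}
\]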


are equivalent. 

The next theorem gives three operations that transform a linear system into an 
equivalent system, and together they can be used to convert any linear system to an 
equivalent system in triangular form. 


THEOREM 1 Let

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
a_{31}x_1 + a_{32}x_2 + \cdots + a_{3n}x_n &= b_3 \\
&\;\;\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

be a linear system. Performing any one of the following operations on the linear
system produces an equivalent linear system.


1. Interchanging any two equations. 
2. Multiplying any equation by a nonzero constant. 
3. Adding a multiple of one equation to another. 


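Proof Interchanging any two equations does not change the solution of the linear
system and therefore yields an equivalent system. If equation i is multiplied by a
constant c ≠ 0, then equation i of the new system is

\[ ca_{i1}x_1 + ca_{i2}x_2 + \cdots + ca_{in}x_n = cb_i \]

Let (s1, s2, ..., sn) be a solution to the original system. Since

\[ a_{i1}s_1 + a_{i2}s_2 + \cdots + a_{in}s_n = b_i, \quad\text{then}\quad ca_{i1}s_1 + ca_{i2}s_2 + \cdots + ca_{in}s_n = cb_i \]

Hence (s1, s2, ..., sn) is a solution of the new linear system. Conversely, since c ≠ 0,
multiplying equation i of the new system by 1/c recovers the original system, so every
solution of the new system is also a solution of the original. Consequently, the
systems are equivalent.

For part (3) of the theorem, consider the new system obtained by adding c times
equation i to equation j of the original system. Thus, equation j of the new system
becomes

\[ (ca_{i1} + a_{j1})x_1 + (ca_{i2} + a_{j2})x_2 + \cdots + (ca_{in} + a_{jn})x_n = cb_i + b_j \]

or equivalently,

\[ c(a_{i1}x_1 + a_{i2}x_2 + \cdots + a_{in}x_n) + (a_{j1}x_1 + a_{j2}x_2 + \cdots + a_{jn}x_n) = cb_i + b_j \]

Now let (s1, s2, ..., sn) be a solution for the original system. Then

\[ a_{i1}s_1 + a_{i2}s_2 + \cdots + a_{in}s_n = b_i \quad\text{and}\quad a_{j1}s_1 + a_{j2}s_2 + \cdots + a_{jn}s_n = b_j \]

Therefore,

\[ c(a_{i1}s_1 + a_{i2}s_2 + \cdots + a_{in}s_n) + (a_{j1}s_1 + a_{j2}s_2 + \cdots + a_{jn}s_n) = cb_i + b_j \]

so that (s1, s2, ..., sn) is a solution of the modified system. Conversely, adding −c times
equation i to equation j of the modified system recovers the original system, so every
solution of the modified system is also a solution of the original, and the systems are
equivalent.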
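
The three operations of Theorem 1 are easy to experiment with. The sketch below is illustrative and not from the original text: each equation is stored as a list of its coefficients followed by its constant term, equations are indexed from 0 (so equation 1 of the text has index 0), and the helper names are hypothetical.

```python
def interchange(system, i, j):
    """Operation 1: interchange equations i and j."""
    new = [row[:] for row in system]
    new[i], new[j] = new[j], new[i]
    return new

def scale(system, i, c):
    """Operation 2: multiply equation i by a nonzero constant c."""
    if c == 0:
        raise ValueError("the constant must be nonzero")
    new = [row[:] for row in system]
    new[i] = [c * v for v in new[i]]
    return new

def add_multiple(system, c, i, j):
    """Operation 3: replace equation j with c*(equation i) + (equation j)."""
    new = [row[:] for row in system]
    new[j] = [c * vi + vj for vi, vj in zip(new[i], new[j])]
    return new

def satisfies(system, s):
    """Check whether the sequence s satisfies every equation of the system."""
    return all(sum(a * x for a, x in zip(row[:-1], s)) == row[-1]
               for row in system)

# x1 - 2x2 + x3 = -1,  2x1 - 3x2 - x3 = 3,  x1 - 2x2 + 2x3 = 1
original = [[1, -2, 1, -1],
            [2, -3, -1, 3],
            [1, -2, 2, 1]]
step = add_multiple(original, -2, 0, 1)    # -2E1 + E2 -> E2
triangular = add_multiple(step, -1, 0, 2)  # -E1 + E3 -> E3
print(triangular)
print(satisfies(original, (19, 11, 2)), satisfies(triangular, (19, 11, 2)))
```

The two operations turn the system into the triangular system solved earlier by back substitution, and (19, 11, 2) satisfies both.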


EXAMPLE 1 Use the elimination method to solve the linear system.

\[
\begin{aligned}
x + y &= 1 \\
-x + y &= 1
\end{aligned}
\]

Solution Adding the first equation to the second gives the equivalent system

\[
\begin{aligned}
x + y &= 1 \\
2y &= 2
\end{aligned}
\]

From the second equation, we have y = 1. Using back substitution gives x = 0.
The graphs of both systems are shown in Fig. 2. Notice that the solution is the same 


in both, but that adding the first equation to the second rotates the line —x + y = 1 
about the point of intersection. 


[Figure 2: (a) the lines x + y = 1 and −x + y = 1; (b) the line x + y = 1 together with the line obtained after adding the first equation to the second.]


Converting a linear system to triangular form often requires many steps. Moreover, 
the operations used to convert one linear system to another are not unique and may 
not be apparent on inspection. To articulate this process, the notation, for example, 


\[ -2E_1 + E_3 \to E_3 \]


will mean add −2 times equation 1 to equation 3, and replace equation 3 with the
result. Equation E1 remains unchanged. The notation Ei ↔ Ej will be used to indicate
that equation i and equation j are interchanged.


EXAMPLE 2 Solve the linear system.

\[
\begin{aligned}
x + y + z &= 4 \\
-x - y + z &= -2 \\
2x - y + 2z &= 2
\end{aligned}
\]

Solution To convert the system to an equivalent triangular system, we first eliminate the
variable x in the second and third equations, using E1 + E2 → E2 and −2E1 + E3 → E3, to obtain

\[
\begin{aligned}
x + y + z &= 4 \\
2z &= 2 \\
-3y &= -6
\end{aligned}
\]

Interchanging the second and third equations (E2 ↔ E3) gives the triangular linear system

\[
\begin{aligned}
x + y + z &= 4 \\
-3y &= -6 \\
2z &= 2
\end{aligned}
\]

Using back substitution, we have z = 1, y = 2, and x = 4 − y − z = 1. Therefore,
the system is consistent with the unique solution (1, 2, 1).
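
The elimination steps used in this example can be automated. The following is a minimal sketch under stated assumptions, not the authors' algorithm verbatim: the system is stored as rows of coefficients followed by the constant term, the name `to_triangular` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def to_triangular(aug):
    """Reduce rows [a1, ..., an, b] to triangular form.

    Only interchanges and 'add a multiple of one equation to another'
    are used, so the result is an equivalent system.
    """
    rows = [[Fraction(v) for v in row] for row in aug]
    m, n = len(rows), len(rows[0]) - 1
    r = 0  # index of the equation that will hold the next leading variable
    for col in range(n):
        # Find an equation at or below position r that contains this variable.
        pivot = next((i for i in range(r, m) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, m):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [vi - factor * vr for vi, vr in zip(rows[i], rows[r])]
        r += 1
        if r == m:
            break
    return rows

example_2 = [[1, 1, 1, 4],
             [-1, -1, 1, -2],
             [2, -1, 2, 2]]
for row in to_triangular(example_2):
    print([str(v) for v in row])
```

For Example 2 the sketch eliminates x from the last two equations and then interchanges them, ending with the same triangular system as in the text.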


Recall from solid geometry that the graph of an equation of the form 
ax + by + cz = d is a plane in three-dimensional space. Hence, the unique solution 
to the linear system of Example 2 is the point of intersection of three planes, as shown 
in Fig. 3(a). For another perspective on this, shown in Fig. 3(b) are the lines of the 
pairwise intersections of the three planes. These lines intersect at a point that is the 
solution to the 3 x 3 linear system. 


[Figure 3: (a) three planes intersecting at a point; (b) the lines of pairwise intersection of the three planes.]

Similar to the 2 x 2 case, the geometry of Euclidean space helps us better understand 
the possibilities for the general solution of a linear system of three equations in three 
variables. In particular, the linear system can have a unique solution if the three planes 
all intersect at a point, as illustrated by Example 2. Alternatively, a 3 x 3 system can 
have infinitely many solutions if 


1. The three planes are all the same. 

2. The three planes intersect in a line (like the pages of a book). 

3. Two of the planes are the same with a third plane intersecting them in a line. 

For example, the linear system given by

\[
\begin{aligned}
-y + z &= 0 \\
y &= 0 \\
z &= 0
\end{aligned}
\]

represents three planes whose intersection is the x axis. That is, z = 0 is the xy plane,
y = 0 is the xz plane, and y = z is the plane that cuts through the x axis at a 45°
angle.

Finally, there are two cases in which a 3 x 3 linear system has no solutions. First, 
the linear system has no solutions if at least one of the planes is parallel to, but not 
the same as, the others. Certainly, when all three planes are parallel, the system has 
no solutions, as illustrated by the linear system 


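\[
\begin{aligned}
z &= 0 \\
z &= 1 \\
z &= 2
\end{aligned}
\]

Also, a 3 x 3 linear system has no solutions if the lines of the pairwise intersections
of the planes are parallel, but not the same, as shown in Fig. 4.

[Figure 4: three planes whose lines of pairwise intersection are parallel.]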

From the previous discussion we see that a 3 x 3 linear system, like a 2 x 2 linear 
system, has no solutions, has a unique solution, or has infinitely many solutions. We 
will see in Sec. 1.4 that this is the case for linear systems of any size. 

In Example 3 we consider a linear system with four variables. Of course the 
geometric reasoning above cannot be applied to the new situation directly, but provides 
the motivation for understanding the many possibilities for the solutions to linear 
systems with several variables. 


EXAMPLE 3 Solve the linear system.

\[
\begin{aligned}
4x_1 - 8x_2 - 3x_3 + 2x_4 &= 13 \\
3x_1 - 4x_2 - x_3 - 3x_4 &= 5 \\
2x_1 - 4x_2 - 2x_3 + 2x_4 &= 6
\end{aligned}
\]

Solution Since every term of the third equation can be divided evenly by 2, we multiply the
third equation by 1/2. After we do so, the coefficient of x1 is 1. We then interchange
the first and third equations, obtaining

\[
\tfrac{1}{2}E_3 \to E_3: \qquad
\begin{aligned}
4x_1 - 8x_2 - 3x_3 + 2x_4 &= 13 \\
3x_1 - 4x_2 - x_3 - 3x_4 &= 5 \\
x_1 - 2x_2 - x_3 + x_4 &= 3
\end{aligned}
\]

\[
E_1 \leftrightarrow E_3: \qquad
\begin{aligned}
x_1 - 2x_2 - x_3 + x_4 &= 3 \\
3x_1 - 4x_2 - x_3 - 3x_4 &= 5 \\
4x_1 - 8x_2 - 3x_3 + 2x_4 &= 13
\end{aligned}
\]

Next, using the operations −3E1 + E2 → E2 and −4E1 + E3 → E3, we obtain
the linear system

\[
\begin{aligned}
x_1 - 2x_2 - x_3 + x_4 &= 3 \\
2x_2 + 2x_3 - 6x_4 &= -4 \\
x_3 - 2x_4 &= 1
\end{aligned}
\]

which is an equivalent system in triangular form. Using back substitution, the general
solution is

\[ x_3 = 2x_4 + 1 \qquad x_2 = x_4 - 3 \qquad x_1 = 3x_4 - 2 \]

with x4 free to assume any real number. It is common in this case to replace x4
with the parameter t. The general solution can now be written as

\[ S = \{(3t - 2,\; t - 3,\; 2t + 1,\; t) \mid t \in \mathbb{R}\} \]

and is called a one-parameter family of solutions. The reader can check that
x1 = 3t − 2, x2 = t − 3, x3 = 2t + 1, and x4 = t is a solution for any t by substituting
these values in the original equations. A particular solution can be obtained
by letting t be a specific value. For example, if t = 0, then a particular solution is
(−2, −3, 1, 0).
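
The substitution check suggested to the reader can also be carried out numerically for a range of parameter values. This is a small illustrative sketch; the names `family` and `is_solution` are hypothetical, and checking finitely many values of t is a sanity check rather than a proof.

```python
from fractions import Fraction

# Coefficients and right-hand sides of the original system of Example 3.
system = [
    ([4, -8, -3, 2], 13),
    ([3, -4, -1, -3], 5),
    ([2, -4, -2, 2], 6),
]

def family(t):
    """Member of the one-parameter family of solutions for parameter t."""
    return (3 * t - 2, t - 3, 2 * t + 1, t)

def is_solution(point):
    """Substitute the point into every equation of the system."""
    return all(sum(a * x for a, x in zip(coeffs, point)) == rhs
               for coeffs, rhs in system)

# Try t = -3, -5/2, ..., 3 using exact rational arithmetic.
checks = [is_solution(family(Fraction(k, 2))) for k in range(-6, 7)]
print(all(checks))
print(family(0))
```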


In Example 3, the variable x4 can assume any real number, giving infinitely many 
solutions for the linear system. In this case we call x4 a free variable. When a linear 
system has infinitely many solutions, there can be more than one free variable. In this 
case, the solution set is an r-parameter family of solutions where r is equal to the 
number of free variables. 


EXAMPLE 4 Solve the linear system.

\[
\begin{aligned}
x_1 - x_2 - 2x_3 - 2x_4 - 2x_5 &= 3 \\
3x_1 - 2x_2 - 2x_3 - 2x_4 - 2x_5 &= -1 \\
-3x_1 + 2x_2 + x_3 + x_4 - x_5 &= -1
\end{aligned}
\]

Solution After performing the operations E3 + E2 → E3 followed by E2 − 3E1 → E2, we
have the equivalent system

\[
\begin{aligned}
x_1 - x_2 - 2x_3 - 2x_4 - 2x_5 &= 3 \\
x_2 + 4x_3 + 4x_4 + 4x_5 &= -10 \\
-x_3 - x_4 - 3x_5 &= -2
\end{aligned}
\]

The variables x4 and x5 are both free variables, so to write the solution, let x4 = s
and x5 = t. From the third equation, we have

\[ x_3 = 2 - x_4 - 3x_5 = 2 - s - 3t \]

Substitution into the second equation gives

\[
\begin{aligned}
x_2 &= -10 - 4x_3 - 4x_4 - 4x_5 \\
&= -10 - 4(2 - s - 3t) - 4s - 4t \\
&= -18 + 8t
\end{aligned}
\]

Finally, substitution into the first equation gives

\[
\begin{aligned}
x_1 &= 3 + x_2 + 2x_3 + 2x_4 + 2x_5 \\
&= 3 + (-18 + 8t) + 2(2 - s - 3t) + 2s + 2t \\
&= -11 + 4t
\end{aligned}
\]

The two-parameter solution set is therefore given by

\[ S = \{(-11 + 4t,\; -18 + 8t,\; 2 - s - 3t,\; s,\; t) \mid s, t \in \mathbb{R}\} \]


Particular solutions for s = t = 0 and s = 0,t = 1 are (—11,—18,2,0,0) and 
(—7, —10, —1, 0, 1), respectively. 


EXAMPLE 5 Solve the linear system.

\[
\begin{aligned}
x_1 - x_2 + 2x_3 &= 5 \\
2x_1 + x_2 &= 2 \\
x_1 + 8x_2 - x_3 &= 3 \\
-x_1 - 5x_2 - 12x_3 &= 4
\end{aligned}
\]

Solution To convert the linear system to an equivalent triangular system, we will eliminate
the first terms in equations 2 through 4, and then the second terms in equations
3 and 4, and then finally the third term in the fourth equation. This is accomplished
by using the following operations.

\[
\begin{matrix} -2E_1 + E_2 \to E_2 \\ -E_1 + E_3 \to E_3 \\ E_1 + E_4 \to E_4 \end{matrix}: \qquad
\begin{aligned}
x_1 - x_2 + 2x_3 &= 5 \\
3x_2 - 4x_3 &= -8 \\
9x_2 - 3x_3 &= -2 \\
-6x_2 - 10x_3 &= 9
\end{aligned}
\]

\[
\begin{matrix} -3E_2 + E_3 \to E_3 \\ 2E_2 + E_4 \to E_4 \end{matrix}: \qquad
\begin{aligned}
x_1 - x_2 + 2x_3 &= 5 \\
3x_2 - 4x_3 &= -8 \\
9x_3 &= 22 \\
-18x_3 &= -7
\end{aligned}
\]

\[
2E_3 + E_4 \to E_4: \qquad
\begin{aligned}
x_1 - x_2 + 2x_3 &= 5 \\
3x_2 - 4x_3 &= -8 \\
9x_3 &= 22 \\
0 &= 37
\end{aligned}
\]

The last equation of the final system is an impossibility, so the original linear system
is inconsistent and has no solution.


In the previous examples the algorithm for converting a linear system to triangular 
form is based on using a leading variable in an equation to eliminate the same variable 
in each equation below it. This process can always be used to convert any linear system 
to triangular form. 


EXAMPLE 6 Find the equation of the parabola that passes through the points (−1, 1), (2, −2),
and (3, 1). Find the vertex of the parabola.

Solution The general form of a parabola is given by y = ax^2 + bx + c. Conditions on a, b,
and c are imposed by substituting the given points into this equation. This gives

\[
\begin{aligned}
1 &= a(-1)^2 + b(-1) + c = a - b + c \\
-2 &= a(2)^2 + b(2) + c = 4a + 2b + c \\
1 &= a(3)^2 + b(3) + c = 9a + 3b + c
\end{aligned}
\]

From these conditions we obtain the linear system

\[
\begin{aligned}
a - b + c &= 1 \\
4a + 2b + c &= -2 \\
9a + 3b + c &= 1
\end{aligned}
\]

First, with a as the leading variable, we use the operations −4E1 + E2 → E2 and
−9E1 + E3 → E3 to eliminate a from equations 2 and 3. In particular, we have

\[
\begin{aligned}
a - b + c &= 1 \\
6b - 3c &= -6 \\
12b - 8c &= -8
\end{aligned}
\]

Next, with b as the leading variable, we eliminate b from equation 3 using −2E2 + E3 → E3, so that

\[
\begin{aligned}
a - b + c &= 1 \\
6b - 3c &= -6 \\
-2c &= 4
\end{aligned}
\]

Now, using back substitution on the last system gives c = −2, b = −2, and a = 1.
Thus, the parabola we seek is

\[ y = x^2 - 2x - 2 \]

Completing the square gives the parabola in standard form

\[ y = (x - 1)^2 - 3 \]

with vertex (1, −3), as shown in Fig. 5.

[Figure 5: graph of the parabola y = x^2 − 2x − 2 with vertex (1, −3).]
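
The parabola computation follows the same pattern for any three points with distinct x coordinates, so it can be sketched as a small program. The function names below are hypothetical, and the solver assumes the 3 x 3 system has a unique solution (it raises `StopIteration` otherwise); this is an illustration rather than the procedure used in the text.

```python
from fractions import Fraction

def solve_unique(aug):
    """Solve a square system given as rows [a1, ..., an, b].

    Assumes the system has a unique solution.
    """
    rows = [[Fraction(v) for v in row] for row in aug]
    n = len(rows)
    for col in range(n):
        pivot = next(i for i in range(col, n) if rows[i][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [v / lead for v in rows[col]]
        for i in range(n):
            if i != col:
                factor = rows[i][col]
                rows[i] = [vi - factor * vc
                           for vi, vc in zip(rows[i], rows[col])]
    return [row[n] for row in rows]

def parabola_through(points):
    """Coefficients (a, b, c) of y = a x^2 + b x + c and the vertex."""
    # Each point (x, y) imposes the condition a*x^2 + b*x + c = y.
    aug = [[x * x, x, 1, y] for x, y in points]
    a, b, c = solve_unique(aug)
    vx = -b / (2 * a)  # assumes a != 0, i.e. the points are not collinear
    return (a, b, c), (vx, a * vx ** 2 + b * vx + c)

coefficients, vertex = parabola_through([(-1, 1), (2, -2), (3, 1)])
print(coefficients, vertex)
```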


Fact Summary 


1. An m x n linear system has a unique solution, infinitely many solutions, or 
no solutions. 


2. Interchanging any two equations in a linear system does not alter the set of 
solutions. 

3. Multiplying any equation in a linear system by a nonzero constant does not 
alter the set of solutions. 




4. Replacing an equation in a linear system with the sum of the equation and a 
scalar multiple of another equation does not alter the set of solutions. 


5. Every linear system can be reduced to an equivalent triangular linear 
system. 


Exercise Set 1.1


Perform the operations — FE, + E2 — E> and 
—2E, + E3 > E3, and write the new equivalent 
system. Solve the linear system. 

In Exercises 5—18, solve the linear system using 
the elimination method. 


1. Consider the linear system 


\[
\begin{aligned}
x_1 - x_2 - 2x_3 &= 3 \\
-x_1 + 2x_2 + 3x_3 &= 1 \\
2x_1 - 2x_2 - 2x_3 &= -2
\end{aligned}
\]

Perform the operations E1 + E2 → E2 and
−2E1 + E3 → E3, and write the new equivalent
system. Solve the linear system.

5 2x+3y=-2


2. Consider the linear system 


2x1 — 2x2 — x3 = —3 = y=-l 
Xx, — 3x2 + x3 = —2 
xı — 2x2 S2 


Perform the operations E1 < E>, 

—2 E; + Ey > Eo, —E, + £3 > E3, Eo < E3, 

and —4 E + E3 —> E3, and write the new 

equivalent system. Solve the linear system. 8. 


3. Consider the linear system 


3x— 2y=4 
X1 +3x4=2 9. ia 
xı + x2 + 4x4 =3 303 
2x1 + x3 + 8x4 =3 
xı + x2 + x3 + 6x4 =2 1 
10. 5 1 
RE ay a 


Perform the operations — FE, + E2 > Ep, 

—2E\+ £3 > £3, —E,; + Ey > Eq, Er 

E4 —> Eq, and — E3 + E4 —> E4, and write the 

new equivalent system. Solve the linear system. 11. 


— 3x — 2y + 2z = —2 
— x—3y+ z=-3 

x—2y+ z=-2 
4. Consider the linear system 


x1 + x3 =-—2 x+3y+ z= 2 
xy + x2 + 4x3 =-1 12. ¢ —2x+2y—4z=-1 
2x1 + 2x3 + x4 = —1 _ y+3z= 1 


—2x—2y+2z= 1 
x +5z=-1 
3x + 2y + 3z=—2 


13 


2x —2y—8z= 2 


3x, + 4x2 + 3x3 =0 


15 3x1 — 4x2 + 3x3 = 4 


—2x; + X2 =2 


16 3x1 — x2 + 2x3 = 1 


X1 2x2 2x3 X4 = 3 


17 — 2x + x2 + x3 —2x4=-—3 


—x+ y+4z=-1 
14. 3x—-— y+2z= 2 


2x1 + 2x2 — x3 =1 
{ = x2 +3x4=2 


In Exercises 19—22, solve for x, y, and z in terms of 
a, b, and c. 


3x+2y=b 
2x+3y=a 
20. { x+ y=b 
3x+ y+3z=a 
21. < -x- z=b 
—x+2y =c 


—3x+2y+ z=a 
22; x— y— z=b 
x— y-—2z=c 


In Exercises 23—28, give restrictions on a, b, and c 
such that the linear system is consistent. 


A 
A e 
Ea 
a (eee 




x—2y+4z=a 
27. $ 2x+ y— z=b 
3x— y+3z=c 
x— y+2z=a 
28. ¢ 2x+4y—3z=b 
4x+2y+ z=c 
In Exercises 29—32, determine the value of a that 
makes the system inconsistent. 


29. { EE Meee 


2x+ay= 3 
se prea 
Drea 
A 


In Exercises 33—36, find an equation in the form 
y = ax^2 + bx + c for the parabola that passes through
the three points. Find the vertex of the parabola. 


33. (0, 0.25), (1, —1.75), (—1, 4.25) 

34. (0, 2), (—3, —1), (0.5, 0.75) 

35. (—0.5, —3.25), (1, 2), (2.3, 2.91) 

36. (0, —2875), (1, —5675), (3, 5525) 

37. Find the point where the three lines 
−x + y = 1, −6x + 5y = 3, and 12x + 5y = 39
intersect. Sketch the lines. 

38. Find the point where the four lines 
2x+y=0,x+y=-1,3x+y=1, and 
4x + y = 2 intersect. Sketch the lines. 

39. Give an example of a 2 x 2 linear system that 
a. Has a unique solution 
b. Has infinitely many solutions 
c. Is inconsistent 


40. Verify that if ad − bc ≠ 0, then the system of
equations
\[
\begin{aligned}
ax + by &= x_1 \\
cx + dy &= x_2
\end{aligned}
\]
has a unique solution.


41. Consider the system
\[
\begin{aligned}
x_1 - x_2 + 3x_3 - x_4 &= 1 \\
x_2 - x_3 + 2x_4 &= 2
\end{aligned}
\]
a. Describe the solution set where the variables x3 and x4 are free.
b. Describe the solution set where the variables x2 and x4 are free.

42. Consider the system
\[
\begin{aligned}
x_1 - x_2 + x_3 - x_4 + x_5 &= 1 \\
x_2 - x_4 - x_5 &= -1 \\
x_3 - 2x_4 + 3x_5 &= 2
\end{aligned}
\]
a. Describe the solution set where the variables x4 and x5 are free.
b. Describe the solution set where the variables x3 and x5 are free.

43. Determine the values of k such that the linear system
\[
\begin{aligned}
9x + ky &= 9 \\
kx + y &= -3
\end{aligned}
\]
has
a. No solutions
b. Infinitely many solutions
c. A unique solution

44. Determine the values of k such that the linear system
\[
\begin{aligned}
kx + y + z &= 0 \\
x + ky + z &= 0 \\
x + y + kz &= 0
\end{aligned}
\]
has
a. A unique solution
b. A one-parameter family of solutions
c. A two-parameter family of solutions


1.2 Matrices and Elementary Row Operations


In Sec. 1.1 we saw that converting a linear system to an equivalent triangular system 
provides an algorithm for solving the linear system. The algorithm can be streamlined 
by introducing matrices to represent linear systems. 


DEFINITION 1 Matrix An m x n matrix is an array of numbers with m rows and n columns. 


For example, the array of numbers 


as ae | 4 
3 1 0 -2 
—2 4 1 3 


is a 3 x 4 matrix. 

When solving a linear system by the elimination method, only the coefficients of 
the variables and the constants on the right-hand side are needed to find the solution. 
The variables are placeholders. Utilizing the structure of a matrix, we can record the 
coefficients and the constants by using the columns as placeholders for the variables. 
For example, the coefficients and constants of the linear system 


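\[
\begin{aligned}
-4x_1 + 2x_2 - 3x_4 &= 11 \\
2x_1 - x_2 - 4x_3 + 2x_4 &= -3 \\
3x_2 - x_4 &= 0 \\
-2x_1 + x_4 &= 4
\end{aligned}
\]

can be recorded in matrix form as

\[
\left[\begin{array}{rrrr|r}
-4 & 2 & 0 & -3 & 11 \\
2 & -1 & -4 & 2 & -3 \\
0 & 3 & 0 & -1 & 0 \\
-2 & 0 & 0 & 1 & 4
\end{array}\right]
\]

This matrix is called the augmented matrix of the linear system. Notice that for an
m x n linear system the augmented matrix is m x (n + 1). The augmented matrix
with the last column deleted

\[
\left[\begin{array}{rrrr}
-4 & 2 & 0 & -3 \\
2 & -1 & -4 & 2 \\
0 & 3 & 0 & -1 \\
-2 & 0 & 0 & 1
\end{array}\right]
\]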


is called the coefficient matrix. Notice that we always use a 0 to record any missing 
terms. 
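
Recording a linear system as an augmented matrix, with 0 for every missing term, can be sketched as follows. The representation is an assumption made for illustration: each equation is a pair consisting of a dictionary that maps a variable's subscript to its coefficient, and the constant on the right-hand side. The function names are hypothetical.

```python
def augmented_matrix(equations, n):
    """Build the m x (n + 1) augmented matrix of a system in n variables.

    A variable that does not appear in an equation is recorded as 0.
    """
    return [[coeffs.get(j, 0) for j in range(1, n + 1)] + [rhs]
            for coeffs, rhs in equations]

def coefficient_matrix(aug):
    """Delete the last column of an augmented matrix."""
    return [row[:-1] for row in aug]

# The 4 x 4 system from the text.
system = [
    ({1: -4, 2: 2, 4: -3}, 11),       # -4x1 + 2x2 - 3x4 = 11
    ({1: 2, 2: -1, 3: -4, 4: 2}, -3), # 2x1 - x2 - 4x3 + 2x4 = -3
    ({2: 3, 4: -1}, 0),               # 3x2 - x4 = 0
    ({1: -2, 4: 1}, 4),               # -2x1 + x4 = 4
]
aug = augmented_matrix(system, 4)
for row in aug:
    print(row)
```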

The method of elimination on a linear system is equivalent to performing similar 
operations on the rows of the corresponding augmented matrix. The relationship is 
illustrated below: 


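Linear system:

\[
\begin{aligned}
x + y - z &= 1 \\
2x - y + z &= -1 \\
-x - y + 3z &= 2
\end{aligned}
\]

Corresponding augmented matrix:

\[
\left[\begin{array}{rrr|r}
1 & 1 & -1 & 1 \\
2 & -1 & 1 & -1 \\
-1 & -1 & 3 & 2
\end{array}\right]
\]

Using the operations −2E1 + E2 → E2 and E1 + E3 → E3, we obtain the equivalent
triangular system

\[
\begin{aligned}
x + y - z &= 1 \\
-3y + 3z &= -3 \\
2z &= 3
\end{aligned}
\]

Using the operations −2R1 + R2 → R2 and R1 + R3 → R3, we obtain the equivalent
augmented matrix

\[
\left[\begin{array}{rrr|r}
1 & 1 & -1 & 1 \\
0 & -3 & 3 & -3 \\
0 & 0 & 2 & 3
\end{array}\right]
\]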


The notation used to describe the operations on an augmented matrix is similar 
to the notation we introduced for equations. In the example above, 


\[ -2R_1 + R_2 \to R_2 \]


means replace row 2 with −2 times row 1 plus row 2. Analogous to the triangular
form of a linear system, a matrix is in triangular form provided that the first nonzero
entry for each row of the matrix is to the right of the first nonzero entry in the row
above it.

The next theorem is a restatement of Theorem 1 of Sec. 1.1, in terms of operations
on the rows of an augmented matrix.


THEOREM 2 Any one of the following operations performed on the augmented matrix
corresponding to a linear system produces an augmented matrix corresponding to an
equivalent linear system.


1. Interchanging any two rows. 
2. Multiplying any row by a nonzero constant. 
3. Adding a multiple of one row to another. 


Solving Linear Systems with Augmented Matrices 


The operations in Theorem 2 are called row operations. An m x n matrix A is called 
row equivalent to an m x n matrix B if B can be obtained from A by a sequence of 
row operations. 

The following steps summarize a process for solving a linear system. 


1. Write the augmented matrix of the linear system.
2. Use row operations to reduce the augmented matrix to triangular form.
3. Interpret the final matrix as a linear system (which is equivalent to the original).
4. Use back substitution to write the solution.


Example 1 illustrates how we can carry out steps 3 and 4. 


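EXAMPLE 1 Given the augmented matrix, find the solution of the corresponding linear system.

\[
\text{a. } \left[\begin{array}{rrr|r}
1 & 0 & 0 & 1 \\
0 & 1 & 0 & 2 \\
0 & 0 & 1 & 3
\end{array}\right]
\qquad
\text{b. } \left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0 & 5 \\
0 & 1 & -1 & 0 & 1 \\
0 & 0 & 0 & 1 & 3
\end{array}\right]
\qquad
\text{c. } \left[\begin{array}{rrrr|r}
1 & 2 & 1 & -1 & 1 \\
0 & 3 & -1 & 0 & 1 \\
0 & 0 & 0 & 0 & 0
\end{array}\right]
\]

Solution

a. Reading directly from the augmented matrix, we have x3 = 3, x2 = 2, and
x1 = 1. So the system is consistent and has a unique solution.

b. In this case the solution to the linear system is x4 = 3, x2 = 1 + x3,
and x1 = 5. So the variable x3 is free, and the general solution is

\[ S = \{(5,\; 1 + t,\; t,\; 3) \mid t \in \mathbb{R}\} \]

c. The augmented matrix is equivalent to the linear system

\[
\begin{aligned}
x_1 + 2x_2 + x_3 - x_4 &= 1 \\
3x_2 - x_3 &= 1
\end{aligned}
\]

Using back substitution, we have

\[ x_2 = \tfrac{1}{3}(1 + x_3) \qquad\text{and}\qquad x_1 = 1 - 2x_2 - x_3 + x_4 = \tfrac{1}{3} - \tfrac{5}{3}x_3 + x_4 \]

So the variables x3 and x4 are free, and the two-parameter solution set is given by

\[ S = \left\{\left(\tfrac{1}{3} - \tfrac{5s}{3} + t,\; \tfrac{1}{3} + \tfrac{s}{3},\; s,\; t\right) \,\middle|\, s, t \in \mathbb{R}\right\} \]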


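EXAMPLE 2 Write the augmented matrix and solve the linear system.

\[
\begin{aligned}
x - 6y - 4z &= -5 \\
2x - 10y - 9z &= -4 \\
-x + 6y + 5z &= 3
\end{aligned}
\]

Solution To solve this system, we write the augmented matrix

\[
\left[\begin{array}{rrr|r}
1 & -6 & -4 & -5 \\
2 & -10 & -9 & -4 \\
-1 & 6 & 5 & 3
\end{array}\right]
\]

where the entries to eliminate are the 2 and the −1 in the first column. Using the
procedure described above with the operations −2R1 + R2 → R2 and R1 + R3 → R3,
the augmented matrix is reduced to triangular form as follows:

\[
\left[\begin{array}{rrr|r}
1 & -6 & -4 & -5 \\
2 & -10 & -9 & -4 \\
-1 & 6 & 5 & 3
\end{array}\right]
\longrightarrow
\left[\begin{array}{rrr|r}
1 & -6 & -4 & -5 \\
0 & 2 & -1 & 6 \\
0 & 0 & 1 & -2
\end{array}\right]
\]

The equivalent triangular linear system is

\[
\begin{aligned}
x - 6y - 4z &= -5 \\
2y - z &= 6 \\
z &= -2
\end{aligned}
\]

which has the solution x = −1, y = 2, and z = −2.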


Echelon Form of a Matrix 
In Example 2, the final augmented matrix

\[
\left[\begin{array}{rrr|r}
1 & -6 & -4 & -5 \\
0 & 2 & -1 & 6 \\
0 & 0 & 1 & -2
\end{array}\right]
\]

is in row echelon form. The general structure of a matrix in row echelon form is
shown in Fig. 1. The height of each step is one row, and the first nonzero term in a
row, denoted in Fig. 1 by *, is to the right of the first nonzero term in the previous
row. All the terms below the stairs are 0.


[Figure 1: staircase pattern of a matrix in row echelon form; * marks the first nonzero term in each row, and all terms below the stairs are 0.]


Although the height of each step in Fig. 1 is one row, a step may extend over
several columns. The leading nonzero term in each row is called a pivot element. The
matrix is in reduced row echelon form if, in addition, each pivot is a 1 and all other
entries in this column are 0. For example, the reduced row echelon form of the matrix

\[
\left[\begin{array}{rrr|r}
1 & -6 & -4 & -5 \\
0 & 2 & -1 & 6 \\
0 & 0 & 1 & -2
\end{array}\right]
\]

is given by

\[
\left[\begin{array}{rrr|r}
1 & 0 & 0 & -1 \\
0 & 1 & 0 & 2 \\
0 & 0 & 1 & -2
\end{array}\right]
\]


Transforming a matrix in row echelon form to reduced row echelon form in effect 
incorporates back substitution as row operations on the matrix. If we read from 
the last matrix above, the solution to the corresponding linear system is, as before, 
x = —1, y = 2, and z = -2. 

Here are three additional matrices that are in reduced row echelon form 


1 0 0 -I1 : o : 1 —2 0 1 -l 
0 1 0 2 0010 0 0 1 -l1 2 
0 0 1 4 0001 0 0 0 0 
and two that are not in reduced row echelon form 
1 3 2 1 1 —2 1 0 
0 1 —5 6 0 0 2 1 
0 1 4 1 0 0 0 3 


In general, for any m x n matrix in reduced row echelon form, the pivot entries 
correspond to dependent variables, and the nonpivot entries correspond to independent 
or free variables. We summarize the previous discussion on row echelon form in the 
next definition. 




DEFINITION 2 Echelon Form An m x n matrix is in row echelon form if 


1. Every row with all 0 entries is below every row with nonzero entries. 


2. If rows 1, 2, ..., k are the rows with nonzero entries and if the leading
nonzero entry (pivot) in row i occurs in column ci, for i = 1, 2, ..., k, then
c1 < c2 < ... < ck.


The matrix is in reduced row echelon form if, in addition, 
3. The first nonzero entry of each row is a 1. 
4. Each column that contains a pivot has all other entries 0. 


The process of transforming a matrix to reduced row echelon form is called 
Gauss-Jordan elimination. 
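
The conditions of Definition 2 translate directly into a pair of checks. The following is a minimal sketch with hypothetical function names; a matrix is given as a list of rows, and the four test matrices are taken from the discussion above.

```python
def pivot_columns(matrix):
    """Column of the first nonzero entry in each row (None for a zero row)."""
    return [next((j for j, v in enumerate(row) if v != 0), None)
            for row in matrix]

def is_row_echelon(matrix):
    """Conditions 1 and 2: zero rows last, pivots move strictly to the right."""
    seen_zero_row = False
    previous = -1
    for p in pivot_columns(matrix):
        if p is None:
            seen_zero_row = True
        elif seen_zero_row or p <= previous:
            return False
        else:
            previous = p
    return True

def is_reduced_row_echelon(matrix):
    """Conditions 3 and 4 in addition: pivots are 1, alone in their column."""
    if not is_row_echelon(matrix):
        return False
    for i, p in enumerate(pivot_columns(matrix)):
        if p is None:
            continue
        if matrix[i][p] != 1:
            return False
        if any(matrix[k][p] != 0 for k in range(len(matrix)) if k != i):
            return False
    return True

echelon = [[1, -6, -4, -5], [0, 2, -1, 6], [0, 0, 1, -2]]
reduced = [[1, 0, 0, -1], [0, 1, 0, 2], [0, 0, 1, -2]]
not_echelon = [[1, 3, 2, 1], [0, 1, -5, 6], [0, 1, 4, 1]]
not_reduced = [[1, -2, 1, 0], [0, 0, 2, 1], [0, 0, 0, 3]]
for m in (echelon, reduced, not_echelon, not_reduced):
    print(is_row_echelon(m), is_reduced_row_echelon(m))
```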


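EXAMPLE 3 Solve the linear system by transforming the augmented matrix to reduced row
echelon form.

\[
\begin{aligned}
x_1 - x_2 - 2x_3 + x_4 &= 0 \\
2x_1 - x_2 - 3x_3 + 2x_4 &= -6 \\
-x_1 + 2x_2 + x_3 + 3x_4 &= 2 \\
x_1 + x_2 - x_3 + 2x_4 &= 1
\end{aligned}
\]

Solution The augmented matrix of the linear system is

\[
\left[\begin{array}{rrrr|r}
1 & -1 & -2 & 1 & 0 \\
2 & -1 & -3 & 2 & -6 \\
-1 & 2 & 1 & 3 & 2 \\
1 & 1 & -1 & 2 & 1
\end{array}\right]
\]

To transform the matrix into reduced row echelon form, we first use the leading 1
in row 1 as a pivot to eliminate the terms in column 1 of rows 2, 3, and 4. To do
this, we use the three row operations

\[ -2R_1 + R_2 \to R_2 \qquad R_1 + R_3 \to R_3 \qquad -R_1 + R_4 \to R_4 \]

in succession, transforming the matrix

\[
\left[\begin{array}{rrrr|r}
1 & -1 & -2 & 1 & 0 \\
2 & -1 & -3 & 2 & -6 \\
-1 & 2 & 1 & 3 & 2 \\
1 & 1 & -1 & 2 & 1
\end{array}\right]
\quad\text{to}\quad
\left[\begin{array}{rrrr|r}
1 & -1 & -2 & 1 & 0 \\
0 & 1 & 1 & 0 & -6 \\
0 & 1 & -1 & 4 & 2 \\
0 & 2 & 1 & 1 & 1
\end{array}\right]
\]

For the second step we use the leftmost 1 in row 2 as the pivot and eliminate
the term in column 2 above the pivot, and the two terms below the pivot. The
required row operations are

\[ R_2 + R_1 \to R_1 \qquad -R_2 + R_3 \to R_3 \qquad -2R_2 + R_4 \to R_4 \]

reducing the matrix

\[
\left[\begin{array}{rrrr|r}
1 & -1 & -2 & 1 & 0 \\
0 & 1 & 1 & 0 & -6 \\
0 & 1 & -1 & 4 & 2 \\
0 & 2 & 1 & 1 & 1
\end{array}\right]
\quad\text{to}\quad
\left[\begin{array}{rrrr|r}
1 & 0 & -1 & 1 & -6 \\
0 & 1 & 1 & 0 & -6 \\
0 & 0 & -2 & 4 & 8 \\
0 & 0 & -1 & 1 & 13
\end{array}\right]
\]

Notice that each entry in row 3 is evenly divisible by 2. Therefore, a leading 1 in
row 3 is obtained using the operation −(1/2)R3 → R3, which results in the matrix

\[
\left[\begin{array}{rrrr|r}
1 & 0 & -1 & 1 & -6 \\
0 & 1 & 1 & 0 & -6 \\
0 & 0 & 1 & -2 & -4 \\
0 & 0 & -1 & 1 & 13
\end{array}\right]
\]

Now, by using the leading 1 in row 3 as a pivot, the operations

\[ R_3 + R_1 \to R_1 \qquad -R_3 + R_2 \to R_2 \qquad R_3 + R_4 \to R_4 \]

row-reduce the matrix

\[
\left[\begin{array}{rrrr|r}
1 & 0 & -1 & 1 & -6 \\
0 & 1 & 1 & 0 & -6 \\
0 & 0 & 1 & -2 & -4 \\
0 & 0 & -1 & 1 & 13
\end{array}\right]
\quad\text{to}\quad
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & -1 & -10 \\
0 & 1 & 0 & 2 & -2 \\
0 & 0 & 1 & -2 & -4 \\
0 & 0 & 0 & -1 & 9
\end{array}\right]
\]

Using the operation −R4 → R4, we change the signs of the entries in row 4 to
obtain the matrix

\[
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & -1 & -10 \\
0 & 1 & 0 & 2 & -2 \\
0 & 0 & 1 & -2 & -4 \\
0 & 0 & 0 & 1 & -9
\end{array}\right]
\]

Finally, using the leading 1 in row 4 as the pivot, we eliminate the terms above it
in column 4. Specifically, the operations

\[ R_4 + R_1 \to R_1 \qquad -2R_4 + R_2 \to R_2 \qquad 2R_4 + R_3 \to R_3 \]

applied to the last matrix give

\[
\left[\begin{array}{rrrr|r}
1 & 0 & 0 & 0 & -19 \\
0 & 1 & 0 & 0 & 16 \\
0 & 0 & 1 & 0 & -22 \\
0 & 0 & 0 & 1 & -9
\end{array}\right]
\]

which is in reduced row echelon form.
The solution can now be read directly from the reduced matrix, giving us

\[ x_1 = -19 \qquad x_2 = 16 \qquad x_3 = -22 \qquad\text{and}\qquad x_4 = -9 \]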


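EXAMPLE 4 Solve the linear system.

\[
\begin{aligned}
3x_1 - x_2 + x_3 + 2x_4 &= -2 \\
x_1 + 2x_2 - x_3 + x_4 &= 1 \\
-x_1 - 3x_2 + 2x_3 - 4x_4 &= -6
\end{aligned}
\]

Solution The linear system in matrix form is

\[
\left[\begin{array}{rrrr|r}
3 & -1 & 1 & 2 & -2 \\
1 & 2 & -1 & 1 & 1 \\
-1 & -3 & 2 & -4 & -6
\end{array}\right]
\]

which can be reduced to

\[
\left[\begin{array}{rrrr|r}
1 & 2 & -1 & 1 & 1 \\
0 & 1 & -1 & 3 & 5 \\
0 & 0 & 1 & -\tfrac{20}{3} & -10
\end{array}\right]
\]

Notice that the system has infinitely many solutions, since from the last row we
see that the variable x4 is a free variable. We can reduce the matrix further, but the
solution can easily be found from the echelon form by back substitution, giving us

\[
\begin{aligned}
x_3 &= -10 + \tfrac{20}{3}x_4 \\
x_2 &= 5 + x_3 - 3x_4 = 5 + \left(-10 + \tfrac{20}{3}x_4\right) - 3x_4 = -5 + \tfrac{11}{3}x_4 \\
x_1 &= 1 - 2x_2 + x_3 - x_4 = 1 - \tfrac{5}{3}x_4
\end{aligned}
\]

Letting x4 be the arbitrary parameter t, we see the general solution is

\[ S = \left\{\left(1 - \tfrac{5t}{3},\; -5 + \tfrac{11t}{3},\; -10 + \tfrac{20t}{3},\; t\right) \,\middle|\, t \in \mathbb{R}\right\} \]
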
Example 5 gives an illustration of a reduced matrix for an inconsistent linear 
system. 


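EXAMPLE 5 Solve the linear system.

\[
\begin{aligned}
x + y + z &= 4 \\
3x - y - z &= 2 \\
x + 3y + 3z &= 8
\end{aligned}
\]

Solution To solve this system, we reduce the augmented matrix to triangular form. The
following steps describe the process.

\[
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 4 \\
3 & -1 & -1 & 2 \\
1 & 3 & 3 & 8
\end{array}\right]
\quad
\begin{matrix} -3R_1 + R_2 \to R_2 \\ -R_1 + R_3 \to R_3 \end{matrix}
\quad\longrightarrow\quad
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 4 \\
0 & -4 & -4 & -10 \\
0 & 2 & 2 & 4
\end{array}\right]
\]

\[
\begin{matrix} R_2 \leftrightarrow R_3 \\ \tfrac{1}{2}R_2 \to R_2 \end{matrix}
\quad\longrightarrow\quad
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 4 \\
0 & 1 & 1 & 2 \\
0 & -4 & -4 & -10
\end{array}\right]
\]

\[
4R_2 + R_3 \to R_3
\quad\longrightarrow\quad
\left[\begin{array}{rrr|r}
1 & 1 & 1 & 4 \\
0 & 1 & 1 & 2 \\
0 & 0 & 0 & -2
\end{array}\right]
\]

The third row of the last matrix corresponds to the equation 0 = −2. As this
equation has no solution, the system is inconsistent. This can also be seen from the
fact that the three planes do not have a common intersection, as shown in Fig. 2.

[Figure 2: three planes with no common point of intersection.]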


In Example 5, each entry in the last row of the reduced coefficient matrix is 0, 
but the constant term is nonzero and the linear system is inconsistent. The reduced 
augmented matrix for a consistent linear system can have a row of zeros. However, in 


this case the term in the last column of this row must also be zero. Example 6 gives 
an illustration. 


EXAMPLE 6 Determine when the augmented matrix represents a consistent linear system.

\[
\left[\begin{array}{rrr|r}
1 & 0 & 2 & a \\
2 & 1 & 5 & b \\
1 & -1 & 1 & c
\end{array}\right]
\]

Solution The operation −2R1 + R2 → R2 followed by −R1 + R3 → R3 and finally followed
by R2 + R3 → R3 reduces the augmented matrix to

\[
\left[\begin{array}{rrr|c}
1 & 0 & 2 & a \\
0 & 1 & 1 & b - 2a \\
0 & 0 & 0 & b + c - 3a
\end{array}\right]
\]

Hence, the corresponding linear system is consistent provided that b + c − 3a = 0.
That is, the system is consistent for all a, b, and c such that the point (a, b, c) lies
on the plane b + c − 3a = 0. Notice also that when the system is consistent, the
third row will contain all zeros and the variable x3 is a free variable.


Shown in the following list is an outline that summarizes the process for trans- 
forming a matrix to its equivalent reduced row echelon form. 


1. If necessary, interchange rows so that the leading nonzero entry of row 1 is the
leftmost nonzero entry of the matrix. Then divide each entry of row 1 by the
leading entry.

2. Eliminate all other nonzero terms in this leading column.

3. Repeat the first two steps, starting with row 2. Note that the leading entry may
not be in column 2.

4. Continue in this way, making sure that the leading entry of each row is a 1 with
zeros elsewhere in that column.

5. The leading 1 in any row should be to the right of a leading 1 in the row above it.

6. All rows of zeros are placed at the bottom of the matrix.

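The outline above can be sketched as a short routine. This is one possible implementation under stated assumptions, not the only way to organize the steps: the name `rref` is hypothetical, exact fractions are used to avoid rounding, and the last column of an augmented matrix is treated like any other column.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    m, n = len(rows), len(rows[0])
    r = 0  # row that receives the next pivot
    for col in range(n):
        if r == m:
            break
        # Steps 1 and 3: find a row at or below r with a nonzero entry here.
        pivot = next((i for i in range(r, m) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][col]
        rows[r] = [v / lead for v in rows[r]]  # make the leading entry a 1
        # Steps 2 and 4: clear every other entry in the pivot column.
        for i in range(m):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [vi - factor * vr
                           for vi, vr in zip(rows[i], rows[r])]
        r += 1
    return rows

# Augmented matrices of Example 2 and Example 3 of this section.
example_2 = [[1, -6, -4, -5], [2, -10, -9, -4], [-1, 6, 5, 3]]
example_3 = [[1, -1, -2, 1, 0], [2, -1, -3, 2, -6],
             [-1, 2, 1, 3, 2], [1, 1, -1, 2, 1]]
for row in rref(example_3):
    print([str(v) for v in row])
```

Because pivots are found column by column from left to right, rows of zeros end up at the bottom automatically, which covers steps 5 and 6.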
We have implicitly assumed in our discussion that every matrix is row equivalent 


to exactly one matrix in reduced row echelon form. It is an important fact that we 
will state here as a theorem without proof. 


THEOREM 3 The reduced row echelon form of every matrix is unique. 


Fact Summary 


1. Altering an augmented matrix by interchanging two rows, or multiplying a
row by a nonzero constant, or replacing a row with the sum of the same
row and a scalar multiple of another row does not alter the set of solutions
of the corresponding linear system.

2. If an augmented matrix is row-reduced to triangular form, the coefficient
matrix has a row of zeros, and the corresponding augmented term is not
zero, then the linear system has no solutions.

3. Every matrix has a unique reduced row echelon form.

4. If the augmented matrix of an n x n linear system is row-reduced to
triangular form and the coefficient matrix has no rows of zeros, then the
linear system has a unique solution.

5. If the augmented matrix of an n x n linear system is row-reduced to
triangular form, the coefficient matrix has rows of zeros, and each
corresponding augmented term is 0, then the linear system has infinitely
many solutions.


Exercise Set 1.2 


In Exercises 1-8, write the linear system 2x —2y=1 
as an augmented matrix. Do not solve the 2. 3x =1 
system. 
2x —z=4 
e 3.4 x+4y+z=2 
IENS 4x+ y-z=l 




—3x+ y+ z=2 1 0/0 
—4z=0 15. | 0 1 00 
—4x +2y —3z=1 0 0 O}1 
5. 2x1 —x3=4 1 0 0/0 
X] + 4x2 + X3 J 16. 0 0 1 0 
0 0 01 
6. 4x, + x2 -4x3 = 1 w 1 0 —2 53 
4x, — 4x2 + 2x3 = —2 “)0 1 -1 2)2 
A 18. E ` F ap 
2x2 3x3 2x4 = 2 
a a a 1 0 —3 1 
19. |} 0 1 0 -=I 7 
3x1 — 3x3 + 4x4 = —3 0 0 1 2| —1 
. < —4 2x2 — 2x3 — 4x4 = 4 
: a ee E Oat 
Bee creer 20.10 1 EEG | 1 
In Exercises 9—20, write the solution of the linear 0 0 0 1 3 
system corresponding to the reduced augmented In Exercises 21—28, determine whether the matrices 
matrix. 


are in reduced row echelon form. 


1 0 2 
MEEI 


Oo 
or © 
Oo 


2 0 
22. 1 


1 
0 0 


| 
et 
| 


10. 


© 

— 

=) 
SN oO N 
= N 


- © 
© 
© 
N 


12. 25. 


O wu 
Doua A 
© 
= 
— 
| 
N 


13. 0 1 2 26. 


© 
O =- © 

— 

n 


14. 


© 
© 
ore 


| 

| 
Hace | 

| HE 

| | 

| | 


5 
0 0 27. 
0 




110 4 ¢ 2x —4z=1 
28. | 0 1 1 5 6 40. < 4x+3y—2z=0 
0001 4 2x +2z=2 
In Exercises 29—36, find the reduced row echelon x+2y+ z=1 
form of the matrix. 41. < 2x +3y+2z7=0 
23 x+ y+ z=2 
TEE 
3x — 2z = —3 
5 E 2 | a fas 2. 
: 3 3 = 
3 3 1 43. e + 2x2 + 3x3 = —3 
31. | 3 -1 0 xı +2x2— x3=-2 
S ee { — 3x. — X3 = 2 
44. 
0 2 1 X1 + x3 = —2 
32. 1 —3 -3 
$ 2 3 [= +3x3 +x4= 2 
45. 2x1 + 3x2 — 3x3 + x4 = 2 
33. | E 1 5 | 2x1 2x2 2x3 X4 = 2 
—3x1 — Xo+ 3x3 + 3x4 =-3 
34 E —2 -l1 “f Xp— x+ x+ x4= 3 
"| —2 -3 0 —3x, +3x2 — x3 +2x4= 1 
—2 2 —l1 2 3x1 — 3x2 + x3 + 3x4 = —3 
35. 0 3 3 -3 47. xi + xX2— x3 —2x4= 3 
1 —4 2 2 4xı — 2x2 + x4= 0 
4 -3 —4 -2 —3x1 +2x2 — x3 — 2x4 = 2 
36. =4 2 1 —4 48. Xi, = X2 = 3x4 =3 
—1 -3 1 —4 4x, — 3x2 + X3 — x4=1 
In Exercises 37—48, write the linear system as an 49. The augmented matrix of a linear system has the 
augmented matrix. Convert the augmented matrix to form 
reduced row echelon form, and find the solution of the 1 2 =la 
linear system. 2 3 —2|b 
—1 -l 1 
37. { x+ y=l 
4x+3y=2 n : 
a. Determine the values of a, b, and c for which 
38 3x+ y=l1 the linear system is consistent. 
` { 4x +2y=0 b. Determine the values of a, b, and c for which 
the linear system is inconsistent. 
aay = 3 c. When it is consistent, does the linear system 
39. 4x— y—-3z= 3 h . luti infinitel 
riy 3 a ave a unique solution or infinitely many 


solutions? 


d. Give a specific consistent linear system and
find one particular solution.


50. The augmented matrix of a linear system has the form

\[
\left[\begin{array}{cc|c}
a & 1 & 1 \\
2 & a - 1 & 1
\end{array}\right]
\]

a. Determine the values of a for which the linear
system is consistent.

b. When it is consistent, does the linear system
have a unique solution or infinitely many
solutions?

c. Give a specific consistent linear system and
find one particular solution.


51. The augmented matrix of a linear system has the form

\[
\left[\begin{array}{ccc|c}
-2 & 3 & 1 & a \\
1 & 1 & -1 & b \\
0 & 5 & -1 & c
\end{array}\right]
\]

a. Determine the values of a, b, and c for which
the linear system is consistent.

b. Determine the values of a, b, and c for which
the linear system is inconsistent.

c. When it is consistent, does the linear system
have a unique solution or infinitely many
solutions?

d. Give a specific consistent linear system and
find one particular solution.


52. Give examples to describe all 2 x 2 reduced row
echelon matrices.


1.3 > Matrix Algebra


Mathematics deals with abstractions that are based on natural concepts in concrete 
settings. For example, we accept the use of numbers and all the algebraic properties 
that go with them. Numbers can be added and multiplied, and they have properties 
such as the distributive and associative properties. In some ways matrices can be 
treated as numbers. For example, we can define addition and multiplication so that 
algebra can be performed with matrices. This extends the application of matrices 
beyond just a means for representing a linear system. 

Let A be an m x n matrix. Then each entry of A can be uniquely specified by 
using the row and column indices of its location, as shown in Fig. 1. 


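\[
A = \begin{bmatrix}
a_{11} & \cdots & a_{1j} & \cdots & a_{1n} \\
\vdots &        & \vdots &        & \vdots \\
a_{i1} & \cdots & a_{ij} & \cdots & a_{in} \\
\vdots &        & \vdots &        & \vdots \\
a_{m1} & \cdots & a_{mj} & \cdots & a_{mn}
\end{bmatrix}
\]

Figure 1 (the entry $a_{ij}$ lies in row i and column j)


For example, if

\[
A = \begin{bmatrix} -2 & 1 & 4 \\ 5 & 7 & 11 \\ 2 & 3 & 22 \end{bmatrix}
\]

then

\[
\begin{array}{lll}
a_{11} = -2 & a_{12} = 1 & a_{13} = 4 \\
a_{21} = 5 & a_{22} = 7 & a_{23} = 11 \\
a_{31} = 2 & a_{32} = 3 & a_{33} = 22
\end{array}
\]


A vector is an n x 1 matrix. The entries of a vector are called its components.
For a given matrix A, it is convenient to refer to its row vectors and its column
vectors. For example, let

\[
A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 0 & 1 \\ 4 & -1 & 2 \end{bmatrix}
\]

Then the column vectors of A are

\[
\begin{bmatrix} 1 \\ 3 \\ 4 \end{bmatrix}
\qquad
\begin{bmatrix} 2 \\ 0 \\ -1 \end{bmatrix}
\qquad \text{and} \qquad
\begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}
\]

while the row vectors of A, written vertically, are

\[
\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}
\qquad
\begin{bmatrix} 3 \\ 0 \\ 1 \end{bmatrix}
\qquad \text{and} \qquad
\begin{bmatrix} 4 \\ -1 \\ 2 \end{bmatrix}
\]


Two m x n matrices A and B are equal if they have the same number of rows
and columns and their corresponding entries are equal. Thus, A = B if and only if
$a_{ij} = b_{ij}$, for $1 \le i \le m$ and $1 \le j \le n$. Addition and scalar multiplication of matrices
are also defined componentwise.


DEFINITION 1 Addition and Scalar Multiplication If A and B are two m x n matrices,
then the sum of the matrices A + B is the m x n matrix with the ij term given by
$a_{ij} + b_{ij}$. The scalar product of the matrix A with the real number c, denoted by
cA, is the m x n matrix with the ij term given by $ca_{ij}$.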


EXAMPLE 1 Perform the operations on the matrices

\[
A = \begin{bmatrix} 2 & 0 & 1 \\ 4 & 3 & -1 \\ -3 & 6 & 5 \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} -2 & 3 & -1 \\ 3 & 5 & 6 \\ 4 & 2 & 1 \end{bmatrix}
\]

a. A + B   b. 2A - 3B


Solution a. We add the two matrices by adding their corresponding entries, so that

\[
\begin{aligned}
A + B &= \begin{bmatrix} 2 & 0 & 1 \\ 4 & 3 & -1 \\ -3 & 6 & 5 \end{bmatrix}
+ \begin{bmatrix} -2 & 3 & -1 \\ 3 & 5 & 6 \\ 4 & 2 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 2 + (-2) & 0 + 3 & 1 + (-1) \\ 4 + 3 & 3 + 5 & -1 + 6 \\ -3 + 4 & 6 + 2 & 5 + 1 \end{bmatrix} \\
&= \begin{bmatrix} 0 & 3 & 0 \\ 7 & 8 & 5 \\ 1 & 8 & 6 \end{bmatrix}
\end{aligned}
\]

b. To evaluate this expression, we first multiply each entry of the matrix A by 2
and each entry of the matrix B by -3. Then we add the resulting matrices.
This gives

\[
\begin{aligned}
2A + (-3B) &= 2\begin{bmatrix} 2 & 0 & 1 \\ 4 & 3 & -1 \\ -3 & 6 & 5 \end{bmatrix}
+ (-3)\begin{bmatrix} -2 & 3 & -1 \\ 3 & 5 & 6 \\ 4 & 2 & 1 \end{bmatrix} \\
&= \begin{bmatrix} 4 & 0 & 2 \\ 8 & 6 & -2 \\ -6 & 12 & 10 \end{bmatrix}
+ \begin{bmatrix} 6 & -9 & 3 \\ -9 & -15 & -18 \\ -12 & -6 & -3 \end{bmatrix} \\
&= \begin{bmatrix} 10 & -9 & 5 \\ -1 & -9 & -20 \\ -18 & 6 & 7 \end{bmatrix}
\end{aligned}
\]


In Example 1(a) reversing the order of the addition of the matrices gives the
same result. That is, A + B = B + A. This is so because addition of real numbers
is commutative. This result holds in general, giving us that matrix addition is also a
commutative operation. Some other familiar properties that hold for real numbers also
hold for matrices and scalars. These properties are given in Theorem 4.


THEOREM 4 Properties of Matrix Addition and Scalar Multiplication Let A, B, and
C be m x n matrices and c and d be real numbers.

1. A + B = B + A

2. A + (B + C) = (A + B) + C

3. c(A + B) = cA + cB

4. (c + d)A = cA + dA

5. c(dA) = (cd)A

6. The m x n matrix with all zero entries, denoted by 0, is such that A + 0 =
0 + A = A.

7. For any matrix A, the matrix -A, whose components are the negative of each
component of A, is such that A + (-A) = (-A) + A = 0.
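
The componentwise definitions can be checked with a short sketch. The helper names `add` and `scale` are hypothetical, and the matrices are the ones read from Example 1; plain Python lists of rows stand in for matrices.

```python
def add(A, B):
    """Sum of two matrices of the same size, entry by entry."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("matrices must have the same size")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scale(c, A):
    """Scalar product cA: multiply every entry of A by c."""
    return [[c * a for a in row] for row in A]

A = [[2, 0, 1], [4, 3, -1], [-3, 6, 5]]
B = [[-2, 3, -1], [3, 5, 6], [4, 2, 1]]

print(add(A, B))                       # A + B
print(add(scale(2, A), scale(-3, B)))  # 2A - 3B
print(add(A, B) == add(B, A))          # property 1 of Theorem 4 for this pair
```

A check on one pair of matrices illustrates the properties of Theorem 4 but does not prove them.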


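Proof In each case it is sufficient to show that the column vectors of the two
matrices agree. We will prove property (2) and leave the others as exercises.

(2) Since the matrices A, B, and C have the same size, the sums (A + B) + C
and A + (B + C) are defined and also have the same size. Let $A_i$, $B_i$, and $C_i$
denote the ith column vector of A, B, and C, respectively. Then

\[
\begin{aligned}
(A_i + B_i) + C_i
&= \left( \begin{bmatrix} a_{1i} \\ \vdots \\ a_{mi} \end{bmatrix}
+ \begin{bmatrix} b_{1i} \\ \vdots \\ b_{mi} \end{bmatrix} \right)
+ \begin{bmatrix} c_{1i} \\ \vdots \\ c_{mi} \end{bmatrix} \\
&= \begin{bmatrix} a_{1i} + b_{1i} \\ \vdots \\ a_{mi} + b_{mi} \end{bmatrix}
+ \begin{bmatrix} c_{1i} \\ \vdots \\ c_{mi} \end{bmatrix}
= \begin{bmatrix} (a_{1i} + b_{1i}) + c_{1i} \\ \vdots \\ (a_{mi} + b_{mi}) + c_{mi} \end{bmatrix}
\end{aligned}
\]

Since the components are real numbers, where the associative property of addition
holds, we have

\[
(A_i + B_i) + C_i
= \begin{bmatrix} (a_{1i} + b_{1i}) + c_{1i} \\ \vdots \\ (a_{mi} + b_{mi}) + c_{mi} \end{bmatrix}
= \begin{bmatrix} a_{1i} + (b_{1i} + c_{1i}) \\ \vdots \\ a_{mi} + (b_{mi} + c_{mi}) \end{bmatrix}
= A_i + (B_i + C_i)
\]

As this holds for every column vector, the matrices (A + B) + C and A + (B + C)
are equal, and we have (A + B) + C = A + (B + C).


Matrix Multiplication


We have defined matrix addition and a scalar multiplication, and we observed that
these operations satisfy many of the analogous properties for real numbers. We have
not yet considered the product of two matrices. Matrix multiplication is more difficult
to define and is developed from the dot product of two vectors.


DEFINITION 2 Dot Product of Vectors Given two vectors

\[
\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
\qquad \text{and} \qquad
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
\]

the dot product is defined by

\[
\mathbf{u} \cdot \mathbf{v} = u_1 v_1 + u_2 v_2 + \cdots + u_n v_n = \sum_{i=1}^{n} u_i v_i
\]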


Observe that the dot product of two vectors is a scalar. For example,

\[
\begin{bmatrix} 2 \\ -3 \\ -1 \end{bmatrix} \cdot \begin{bmatrix} -5 \\ 1 \\ 4 \end{bmatrix}
= (2)(-5) + (-3)(1) + (-1)(4) = -17
\]
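
The dot product translates directly into a sum over paired components. A minimal sketch follows; the name `dot` is an illustrative choice, and vectors are represented as plain lists.

```python
def dot(u, v):
    """Dot product of two vectors with the same number of components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(ui * vi for ui, vi in zip(u, v))

print(dot([2, -3, -1], [-5, 1, 4]))  # -17, matching the computation above
```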


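Now to motivate the concept and need for matrix multiplication we first introduce
the operation of multiplying a vector by a matrix. As an illustration let

\[
B = \begin{bmatrix} 1 & -1 \\ -2 & 1 \end{bmatrix}
\qquad \text{and} \qquad
\mathbf{v} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}
\]

The product of B and v, denoted by Bv, is a vector, in this case with two components.
The first component of Bv is the dot product of the first row vector of B with v,
while the second component is the dot product of the second row vector of B with v,
so that

\[
B\mathbf{v} = \begin{bmatrix} 1 & -1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix}
= \begin{bmatrix} (1)(1) + (-1)(3) \\ (-2)(1) + (1)(3) \end{bmatrix}
= \begin{bmatrix} -2 \\ 1 \end{bmatrix}
\]

Using this operation, the matrix B transforms the vector $\mathbf{v} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$ to the vector
$B\mathbf{v} = \begin{bmatrix} -2 \\ 1 \end{bmatrix}$. If $A = \begin{bmatrix} -1 & 2 \\ 0 & 1 \end{bmatrix}$ is another matrix, then the product of A and Bv
is given by

\[
A(B\mathbf{v}) = \begin{bmatrix} -1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} -2 \\ 1 \end{bmatrix}
= \begin{bmatrix} 4 \\ 1 \end{bmatrix}
\]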


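The question then arises, is there a single matrix which can be used to transform
the original vector $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ to $\begin{bmatrix} 4 \\ 1 \end{bmatrix}$? To answer this question, let

\[
\mathbf{v} = \begin{bmatrix} x \\ y \end{bmatrix}
\qquad
A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
\]

The product of B and v is

\[
B\mathbf{v} = \begin{bmatrix} b_{11}x + b_{12}y \\ b_{21}x + b_{22}y \end{bmatrix}
\]

Now, the product of A and Bv is

\[
\begin{aligned}
A(B\mathbf{v}) &= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} b_{11}x + b_{12}y \\ b_{21}x + b_{22}y \end{bmatrix} \\
&= \begin{bmatrix} a_{11}(b_{11}x + b_{12}y) + a_{12}(b_{21}x + b_{22}y) \\ a_{21}(b_{11}x + b_{12}y) + a_{22}(b_{21}x + b_{22}y) \end{bmatrix} \\
&= \begin{bmatrix} (a_{11}b_{11} + a_{12}b_{21})x + (a_{11}b_{12} + a_{12}b_{22})y \\ (a_{21}b_{11} + a_{22}b_{21})x + (a_{21}b_{12} + a_{22}b_{22})y \end{bmatrix} \\
&= \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
\end{aligned}
\]

Thus, we see that A(Bv) is the product of the matrix

\[
\begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}
\]

and the vector $\begin{bmatrix} x \\ y \end{bmatrix}$. We refer to this matrix as the product of A and B, denoted by
AB, so that

\[
A(B\mathbf{v}) = (AB)\mathbf{v}
\]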


See Fig. 2.

Figure 2 (plot of the vectors v, Bv, and ABv; the matrix AB carries v directly to ABv)


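Notice that the product matrix AB is obtained by computing the dot product of
each row vector of the matrix A, on the left, with each column vector of the matrix
B, on the right. Using the matrices A and B given above, we have

\[
\begin{aligned}
AB &= \begin{bmatrix} -1 & 2 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -2 & 1 \end{bmatrix} \\
&= \begin{bmatrix} (-1)(1) + (2)(-2) & (-1)(-1) + (2)(1) \\ (0)(1) + (1)(-2) & (0)(-1) + (1)(1) \end{bmatrix}
= \begin{bmatrix} -5 & 3 \\ -2 & 1 \end{bmatrix}
\end{aligned}
\]

This matrix transforms the vector $\begin{bmatrix} 1 \\ 3 \end{bmatrix}$ to $\begin{bmatrix} 4 \\ 1 \end{bmatrix}$ in one step. That is,

\[
(AB)\mathbf{v} = \begin{bmatrix} -5 & 3 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \end{bmatrix}
= \begin{bmatrix} 4 \\ 1 \end{bmatrix}
\]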


which was our original aim. The notion of matrices as transformations is taken up 
again in Chap. 4 where we consider more general transformations of vectors. 
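
The identity A(Bv) = (AB)v can be checked numerically for the matrices of this illustration. The sketch below is illustrative only: `mat_vec` and `mat_mul` are hypothetical helper names, and the entries of A, B, and v are the ones read from the discussion above.

```python
def mat_vec(M, v):
    """Multiply a matrix by a vector: one dot product per row of M."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def mat_mul(A, B):
    """Product AB: dot each row of A with each column of B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

A = [[-1, 2], [0, 1]]
B = [[1, -1], [-2, 1]]
v = [1, 3]

two_steps = mat_vec(A, mat_vec(B, v))  # apply B first, then A
one_step = mat_vec(mat_mul(A, B), v)   # apply the single matrix AB
print(two_steps, one_step)
```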
For another illustration of the matrix product let

\[
A = \begin{bmatrix} 1 & 3 & 0 \\ 2 & 1 & -3 \\ -4 & 6 & 2 \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} 3 & -2 & 5 \\ -1 & 4 & -2 \\ 1 & 0 & 3 \end{bmatrix}
\]

The entries across the first row of the product matrix AB are obtained from the dot
product of the first row vector of A with the first, second, and third column vectors of
B, respectively. The terms in the second row of AB are the dot products of the second
row vector of A with the first, second, and third column vectors of B, respectively.
Finally, the terms in the third row of AB are the dot products of the third row vector
of A again with the first, second, and third column vectors of B, respectively. Thus,
the product matrix AB is

\[
\begin{aligned}
AB &= \begin{bmatrix}
(1)(3) + (3)(-1) + (0)(1) & -2 + 12 + 0 & 5 - 6 + 0 \\
6 - 1 - 3 & -4 + 4 + 0 & 10 - 2 - 9 \\
-12 - 6 + 2 & 8 + 24 + 0 & -20 - 12 + 6
\end{bmatrix} \\
&= \begin{bmatrix} 0 & 10 & -1 \\ 2 & 0 & -1 \\ -16 & 32 & -26 \end{bmatrix}
\end{aligned}
\]


In the previous example, the product AB exists since the matrices A and B have
the same number of rows and the same number of columns. This condition can be
relaxed somewhat. In general, the product of two matrices A and B exists if the
number of columns of A is equal to the number of rows of B.


DEFINITION 3 Matrix Multiplication Let A be an m x n matrix and B an n x p matrix; then
the product AB is an m x p matrix. The ij term of AB is the dot product of the
ith row vector of A with the jth column vector of B, so that

\[
(AB)_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{in}b_{nj} = \sum_{k=1}^{n} a_{ik}b_{kj}
\]
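
The definition of the product can be written out index by index. This is a minimal sketch assuming matrices stored as lists of rows; the name `mat_mul` is hypothetical, and the size check mirrors the requirement that the number of columns of A equal the number of rows of B.

```python
def mat_mul(A, B):
    """Return AB, where entry (i, j) is the sum over k of A[i][k] * B[k][j]."""
    m, n, p = len(A), len(B), len(B[0])
    if any(len(row) != n for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(p)]
            for i in range(m)]

A = [[1, 3, 0], [2, 1, -3], [-4, 6, 2]]
B = [[3, -2, 5], [-1, 4, -2], [1, 0, 3]]
print(mat_mul(A, B))

# A 3 x 3 matrix times a 2 x 3 matrix is not defined, so this raises ValueError.
try:
    mat_mul(B, [[1, 2, 3], [4, 5, 6]])
    size_error = False
except ValueError:
    size_error = True
print(size_error)
```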


It is important to recognize that not all properties of real numbers carry over to 
properties of matrices. Because matrix multiplication is only defined when the number 
of columns of the matrix on the left equals the number of rows of the matrix on the 
right, it is possible for AB to exist with BA being undefined. For example, 


3 —2 5 
aafe es —1 4 -2 
1 0 3 
is defined, but 
3 —2 5 
eee (tema 
1 0 3 


is not. As a result, we cannot interchange the order when multiplying two matrices 
unless we know beforehand that the matrices commute. We say two matrices A and 
B commute when AB = BA. 

Example 2 illustrates that even when AB and BA are both defined, they might 
not be equal. 


EXAMPLE 2 Verify that the matrices


1 O 0 1 
A= | 2 and ae 


do not satisfy the commutative property for multiplication. 


Solution The products are


and 


so that AB # BA. 


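In Example 3 we describe all matrices that commute with a particular matrix.


EXAMPLE 3 Find all 2 x 2 matrices that commute with the matrix

\[
A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
\]

Solution We start by letting B denote an arbitrary 2 x 2 matrix

\[
B = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

Then the product of matrix A on the left with matrix B on the right is given by

\[
AB = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} a & b \\ c & d \end{bmatrix}
= \begin{bmatrix} a & b \\ a + c & b + d \end{bmatrix}
\]

On the other hand,

\[
BA = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
= \begin{bmatrix} a + b & b \\ c + d & d \end{bmatrix}
\]

Setting AB = BA, we obtain

\[
a = a + b \qquad a + c = c + d \qquad \text{and} \qquad b + d = d
\]

so that b = 0 and a = d. Let S be the set of all 2 x 2 matrices defined by

\[
S = \left\{ \begin{bmatrix} a & 0 \\ c & a \end{bmatrix} \;\middle|\; a, c \in \mathbb{R} \right\}
\]

Then each matrix in S commutes with the matrix A.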
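
The conclusion of Example 3 can be spot-checked numerically. In this sketch the matrix A is taken to have rows (1, 0) and (1, 1), which is what the products in the example imply; `mat_mul` and `commutes` are hypothetical helper names.

```python
def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def commutes(A, B):
    """True when AB = BA."""
    return mat_mul(A, B) == mat_mul(B, A)

A = [[1, 0], [1, 1]]

# Matrices with b = 0 and a = d commute with A for a few sample values.
print(all(commutes(A, [[a, 0], [c, a]]) for a in range(-3, 4) for c in range(-3, 4)))

# A matrix with b != 0 does not commute with A.
print(commutes(A, [[1, 2], [3, 4]]))
```

Sampling a few values only illustrates the result; the algebra in the example is what shows it for all real a and c.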


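EXAMPLE 4 Perform the operations on the matrices

\[
A = \begin{bmatrix} -3 & 1 \\ 2 & 2 \\ -1 & 5 \end{bmatrix}
\qquad
B = \begin{bmatrix} -1 & 1 & -1 & 3 \\ 2 & 5 & -3 & 1 \end{bmatrix}
\qquad \text{and} \qquad
C = \begin{bmatrix} 3 & 2 & -2 & 1 \\ 1 & 6 & -2 & 4 \end{bmatrix}
\]

a. A(B + C)   b. AB + AC


Solution We first notice that the matrix A is 3 x 2 and both B and C are 2 x 4, so AB and
AC are defined. Also since the matrices B and C have the same number of rows
and columns, the matrix B + C is defined, so the expressions in parts (a) and (b)
are defined.


a. We first add the matrices B and C inside the parentheses and then multiply
on the left by the matrix A. This gives us

\[
\begin{aligned}
A(B + C) &= \begin{bmatrix} -3 & 1 \\ 2 & 2 \\ -1 & 5 \end{bmatrix}
\left( \begin{bmatrix} -1 & 1 & -1 & 3 \\ 2 & 5 & -3 & 1 \end{bmatrix}
+ \begin{bmatrix} 3 & 2 & -2 & 1 \\ 1 & 6 & -2 & 4 \end{bmatrix} \right) \\
&= \begin{bmatrix} -3 & 1 \\ 2 & 2 \\ -1 & 5 \end{bmatrix}
\begin{bmatrix} 2 & 3 & -3 & 4 \\ 3 & 11 & -5 & 5 \end{bmatrix} \\
&= \begin{bmatrix}
-3(2) + 1(3) & -3(3) + 1(11) & -3(-3) + 1(-5) & -3(4) + 1(5) \\
2(2) + 2(3) & 2(3) + 2(11) & 2(-3) + 2(-5) & 2(4) + 2(5) \\
-1(2) + 5(3) & -1(3) + 5(11) & -1(-3) + 5(-5) & -1(4) + 5(5)
\end{bmatrix} \\
&= \begin{bmatrix} -3 & 2 & 4 & -7 \\ 10 & 28 & -16 & 18 \\ 13 & 52 & -22 & 21 \end{bmatrix}
\end{aligned}
\]


b. In this case we compute AB and AC separately and then add the two resulting
matrices. We have

\[
\begin{aligned}
AB + AC &= \begin{bmatrix} -3 & 1 \\ 2 & 2 \\ -1 & 5 \end{bmatrix}
\begin{bmatrix} -1 & 1 & -1 & 3 \\ 2 & 5 & -3 & 1 \end{bmatrix}
+ \begin{bmatrix} -3 & 1 \\ 2 & 2 \\ -1 & 5 \end{bmatrix}
\begin{bmatrix} 3 & 2 & -2 & 1 \\ 1 & 6 & -2 & 4 \end{bmatrix} \\
&= \begin{bmatrix} 5 & 2 & 0 & -8 \\ 2 & 12 & -8 & 8 \\ 11 & 24 & -14 & 2 \end{bmatrix}
+ \begin{bmatrix} -8 & 0 & 4 & 1 \\ 8 & 16 & -8 & 10 \\ 2 & 28 & -8 & 19 \end{bmatrix} \\
&= \begin{bmatrix} -3 & 2 & 4 & -7 \\ 10 & 28 & -16 & 18 \\ 13 & 52 & -22 & 21 \end{bmatrix}
\end{aligned}
\]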

Notice that in Example 4 the matrix equation

\[
A(B + C) = AB + AC
\]

holds. This and other familiar properties involving multiplication and addition of real
numbers hold for matrices. They are listed in Theorem 5.


THEOREM 5 Properties of Matrix Multiplication Let A, B, and C be matrices with sizes
so that the given expressions are all defined, and let c be a real number.

1. A(BC) = (AB)C

2. c(AB) = (cA)B = A(cB)

3. A(B + C) = AB + AC

4. (B + C)A = BA + CA


We have already seen that unlike with real numbers, matrix multiplication does
not commute. There are other properties of the real numbers that do not hold for
matrices. Recall that if x and y are real numbers such that xy = 0, then either x = 0
or y = 0. This property does not hold for matrices. For example, let

\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\qquad \text{and} \qquad
B = \begin{bmatrix} -1 & -1 \\ 1 & 1 \end{bmatrix}
\]

Then

\[
AB = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} -1 & -1 \\ 1 & 1 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]


Transpose of a Matrix


The transpose of a matrix is obtained by interchanging the rows and columns of a
matrix.


DEFINITION 4 Transpose If A is an m x n matrix, the transpose of A, denoted by $A^t$, is the
n x m matrix with ij term

\[
(A^t)_{ij} = a_{ji}
\]

where $1 \le i \le n$ and $1 \le j \le m$.


For example, the transpose of the matrix

\[
A = \begin{bmatrix} 1 & 2 & -3 \\ 0 & 1 & 4 \\ -1 & 2 & 1 \end{bmatrix}
\qquad \text{is} \qquad
A^t = \begin{bmatrix} 1 & 0 & -1 \\ 2 & 1 & 2 \\ -3 & 4 & 1 \end{bmatrix}
\]

Notice that the row vectors of A become the column vectors of $A^t$. Theorem 6
gives some properties of the transpose.


THEOREM 6 Suppose A and B are m x n matrices, C is an n x p matrix, and c is a scalar.

1. $(A + B)^t = A^t + B^t$

2. $(AC)^t = C^tA^t$

3. $(A^t)^t = A$

4. $(cA)^t = cA^t$

Proof (2) We start by showing that the products involved are all defined. Since
AC is m x p, then $(AC)^t$ is p x m. As $C^t$ is p x n and $A^t$ is n x m, then $C^tA^t$
is also p x m. So the sizes of the products agree. Now to show that the products
are equal, we apply the definitions of multiplication and transpose to obtain

\[
\begin{aligned}
(C^tA^t)_{ij} &= \sum_{k=1}^{n} (C^t)_{ik}(A^t)_{kj} \\
&= \sum_{k=1}^{n} c_{ki}a_{jk} = \sum_{k=1}^{n} a_{jk}c_{ki} = (AC)_{ji} \\
&= ((AC)^t)_{ij}
\end{aligned}
\]

The proofs of parts 1, 3, and 4 are left as exercises.
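
Property 2 of Theorem 6 can be checked on a concrete pair of matrices. In the sketch below, `transpose` and `mat_mul` are hypothetical helper names, A is the 3 x 3 matrix from the transpose example, and C is an arbitrary 3 x 2 matrix chosen only for illustration.

```python
def transpose(A):
    """Interchange rows and columns: entry (i, j) of the result is A[j][i]."""
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    """Product of two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

A = [[1, 2, -3], [0, 1, 4], [-1, 2, 1]]
C = [[1, 0], [2, -1], [0, 3]]  # illustrative 3 x 2 matrix, not from the text

print(transpose(A))
# (AC)^t and C^t A^t should agree entry by entry.
print(transpose(mat_mul(A, C)) == mat_mul(transpose(C), transpose(A)))
```

Note the reversed order on the right-hand side: the product of the transposes in the original order, $A^tC^t$, is not even defined for these sizes.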


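DEFINITION 5 Symmetric Matrix An n x n matrix A is symmetric provided that $A^t = A$.


EXAMPLE 5 Find all 2 x 2 matrices that are symmetric.


Solution Let

\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

Then A is symmetric if and only if

\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & c \\ b & d \end{bmatrix} = A^t
\]

which holds if and only if b = c. So a 2 x 2 matrix is symmetric if and only if the
matrix has the form

\[
\begin{bmatrix} a & b \\ b & d \end{bmatrix}
\]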

Fact Summary 


Let A, B, and C be matrices. 


1. The definitions of matrix addition and scalar multiplication satisfy many of 
the properties enjoyed by real numbers. This allows algebra to be carried 
out with matrices. 


2. When AB is defined, the ij entry of the product matrix is the dot product
of the ith row vector of A with the jth column vector of B.

3. Matrix multiplication does not in general commute. Even when AB and
BA are both defined, it is possible for $AB \neq BA$.

4. The distributive properties hold. That is, A(B + C) = AB + AC and
(B + C)A = BA + CA.

5. $(A + B)^t = A^t + B^t$, $(AB)^t = B^tA^t$, $(A^t)^t = A$, $(cA)^t = cA^t$

6. The matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is symmetric if and only if b = c.


Exercise Set 1.3


In Exercises 1—4, use the matrices 7. Find AB and BA. 


ic | 2 -3 | pe | -1 3 | 8. Show that 3(AB) = A(3B). 
f4 1 ~ 1} S . : 
In Exercises 9 and 10, use the matrices 
fi 1 2 -3 -3 
c=[; -| a=| 3 -2 j 
1. Find A + B and B + A. 3 -1 
2. Find 3A — 2B. B= a E 
3. Find (A + B) + C and A + (B + C). 
4. Find 3(A + B) — 5C. 9. Find AB. 
In Exercises 5 and 6, use the matrices 10. Find BA. 
S 11. Let ba. and 
A= 1 0 2 A= 3 -3 3 
0 —2 3 2% 2 1 
-1 3 3 and 
B=| —2 5 2 Sf 3o e3 
L O24 B= 0 —1 2, 
P 3 ; 3 —2 -1 
c=! 3 10 6 = pa AB. 
. Let 
2 2 1I ERE E 
A= —3 2 1 
5, Find (A — B) +C and 2A + B. a NE 
6. Show that A+2B—C=0. and 
In Exercises 7 and 8, use the matri aaa aes 
n Exercises 7 and 8, use the matrices pel 22 3 


a=] $i sal A e ae 
= = Find AB. 




In Exercises 13—16, use the matrices 
—2 -3 2 0 
A239) Pela 
2 0 
zeae 


13. Find A(B + C). 
14. Find (A+ B)C. 
15. Find 2A(B — 3C). 


16. Find (A +2B)@GC). 
In Exercises 17—24, use the matrices 


er 0 -1 

=|1 0 -2 
-3 1 1 
s=| -3 -3 ei 

_f 3 -1 

Z| -1 -3 


Whenever possible, perform the operations. If a 
computation cannot be made, explain why. 


17. 2A‘ — B' 
18. B'—2A 
19. AB’ 

20. BA’ 

21. (A! + B')C 
22. C(A! + B’) 
23. (A'C)B 
24. (A'B')C 
25. Let 


ee 


Show that AB = AC and yet $B \neq C$.


26. Let 
0 2 
a[o 5] 


Find a 2 x 2 matrix B that is not the zero matrix,
such that AB is the zero matrix.


27. Find all 2 x 2 matrices of the form

A

such that

\[
A^2 = AA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

28. Let A= | ii |. Find all matrices of the form
M = E i | such that AM = MA.


29. Find matrices A and B such that AB = 0 but
$BA \neq 0$.


30. Show there are no 2 x 2 matrices A and B such
that

\[
AB - BA = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]


31. Determine all values of a and b such that

\[
\begin{bmatrix} 1 & 2 \\ a & 0 \end{bmatrix} \begin{bmatrix} 3 & b \\ -4 & 1 \end{bmatrix}
= \begin{bmatrix} -5 & 6 \\ 12 & 16 \end{bmatrix}
\]


32. If A and B are 2 x 2 matrices, show that the sum
of the terms on the diagonal of AB - BA is 0.


33. Let


Find the matrix A”.


34. If A and B are n x n matrices, when does
$(A + B)(A - B) = A^2 - B^2$?


35. If the matrices A and B commute, show that
$A^2B = BA^2$.

36. Suppose A, B, and C are n x n matrices and B
and C both commute with A.

a. Show that BC and A commute.

b. Give specific matrices to show that BC and
CB do not have to be equal.


37. Suppose that A is an n x n matrix. Show that if
for each vector x in $\mathbb{R}^n$, Ax = 0, then A is the
zero matrix.


38. For each positive integer n, let

\[
A_n = \begin{bmatrix} 1 - n & -n \\ n & 1 + n \end{bmatrix}
\]

Show that $A_nA_m = A_{n+m}$.


39. Find all 2 x 2 matrices that satisfy $AA^t = 0$.


40. Suppose that A and B are symmetric matrices.
Show that if AB = BA, then AB is symmetric.


41. If A is an m x n matrix, show that $AA^t$ and $A^tA$
are both defined and are both symmetric.


42. An n x n matrix A is called idempotent provided
that $A^2 = AA = A$. Suppose that A and B are
n x n idempotent matrices. Show that if
AB = BA, then the matrix AB is idempotent.


43. An n x n matrix A is skew-symmetric provided
$A^t = -A$. Show that if a matrix is
skew-symmetric, then the diagonal entries are 0.


44. The trace of an n x n matrix A is the sum of the
diagonal terms, denoted tr(A).

a. If A and B are n x n matrices, show that
tr(A + B) = tr(A) + tr(B).

b. If A is an n x n matrix and c is a scalar, show
that tr(cA) = c tr(A).


1.4 > The Inverse of a Square Matrix 


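In the real number system, the number 1 is the multiplicative identity. That is, for any
real number a,

\[
a \cdot 1 = 1 \cdot a = a
\]

We also know that for every number x with $x \neq 0$, there exists the number $\frac{1}{x}$, also
written $x^{-1}$, such that

\[
x \cdot \frac{1}{x} = 1
\]

We seek a similar relationship for square matrices. For an n x n matrix A, we can
check that the n x n matrix

\[
I = \begin{bmatrix}
1 & 0 & \cdots & 0 \\
0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 1
\end{bmatrix}
\]

is the multiplicative identity. That is, if A is any n x n matrix, then

\[
AI = IA = A
\]

This special matrix is called the identity matrix. For example, the 2 x 2, 3 x 3, and
4 x 4 identity matrices are, respectively,

\[
\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad \text{and} \qquad
\begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}
\]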


DEFINITION 1 Inverse of a Square Matrix Let A be an n x n matrix. If there exists an n x n
matrix B such that

\[
AB = I = BA
\]

then the matrix B is a (multiplicative) inverse of the matrix A.


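EXAMPLE 1 Find an inverse of the matrix

\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}
\]

Solution In order for a 2 x 2 matrix $B = \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}$ to be an inverse of A, matrix B must
satisfy

\[
AB = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}
= \begin{bmatrix} x_1 + x_3 & x_2 + x_4 \\ x_1 + 2x_3 & x_2 + 2x_4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

This matrix equation is equivalent to the linear system

\[
\begin{aligned}
x_1 + x_3 &= 1 \\
x_2 + x_4 &= 0 \\
x_1 + 2x_3 &= 0 \\
x_2 + 2x_4 &= 1
\end{aligned}
\]

The augmented matrix and the reduced row echelon form are given by

\[
\left[\begin{array}{cccc|c}
1 & 0 & 1 & 0 & 1 \\
0 & 1 & 0 & 1 & 0 \\
1 & 0 & 2 & 0 & 0 \\
0 & 1 & 0 & 2 & 1
\end{array}\right]
\longrightarrow
\left[\begin{array}{cccc|c}
1 & 0 & 0 & 0 & 2 \\
0 & 1 & 0 & 0 & -1 \\
0 & 0 & 1 & 0 & -1 \\
0 & 0 & 0 & 1 & 1
\end{array}\right]
\]

Thus, the solution is $x_1 = 2$, $x_2 = -1$, $x_3 = -1$, $x_4 = 1$, and an inverse matrix is

\[
B = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}
\]

The reader should verify that AB = BA = I.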


Theorem 7 establishes the uniqueness, when it exists, of the multiplicative inverse. 


THEOREM 7 The inverse of a matrix, if it exists, is unique. 


Proof Assume that the square matrix A has an inverse and that B and C are 
both inverse matrices of A. That is, AB = BA = I and AC = CA = I. We show 
that B = C. Indeed, 


\[
B = BI = B(AC) = (BA)C = (I)C = C
\]


We refer to the unique inverse as the inverse of A and denote it by $A^{-1}$. When
the inverse of a matrix A exists, we call A invertible. Otherwise, the matrix A is
called noninvertible.


THEOREM 8 The inverse of the matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ exists if and only if $ad - bc \neq 0$. In this
case the inverse is the matrix

\[
A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]
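Proof First, assume that $ad - bc \neq 0$, and let

\[
A^{-1} = \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}
\]

We need to find $x_1$, $x_2$, $x_3$, and $x_4$ such that

\[
AA^{-1} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

Taking the product of the two matrices yields

\[
\begin{bmatrix} ax_1 + bx_3 & ax_2 + bx_4 \\ cx_1 + dx_3 & cx_2 + dx_4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

which results in the linear system

\[
\begin{aligned}
ax_1 + bx_3 &= 1 \\
ax_2 + bx_4 &= 0 \\
cx_1 + dx_3 &= 0 \\
cx_2 + dx_4 &= 1
\end{aligned}
\]

The augmented matrix of this linear system is given by

\[
\left[\begin{array}{cccc|c}
a & 0 & b & 0 & 1 \\
0 & a & 0 & b & 0 \\
c & 0 & d & 0 & 0 \\
0 & c & 0 & d & 1
\end{array}\right]
\]

which reduces to

\[
\left[\begin{array}{cccc|c}
1 & 0 & 0 & 0 & \frac{d}{ad - bc} \\
0 & 1 & 0 & 0 & -\frac{b}{ad - bc} \\
0 & 0 & 1 & 0 & -\frac{c}{ad - bc} \\
0 & 0 & 0 & 1 & \frac{a}{ad - bc}
\end{array}\right]
\]

Since $ad - bc \neq 0$, the inverse of the matrix A is

\[
A^{-1} = \begin{bmatrix} \frac{d}{ad - bc} & -\frac{b}{ad - bc} \\ -\frac{c}{ad - bc} & \frac{a}{ad - bc} \end{bmatrix}
= \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}
\]

To prove the reverse claim, we use the contrapositive. That is, if ad - bc = 0,
then the inverse does not exist. An outline of the proof is given in Exercise 41.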
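
The formula of Theorem 8 is short enough to sketch directly. The function name `inverse_2x2` is hypothetical, integer entries are assumed so that `fractions.Fraction` gives exact results, and returning `None` for the case ad - bc = 0 is a design choice made for this illustration.

```python
from fractions import Fraction

def inverse_2x2(A):
    """Inverse of [[a, b], [c, d]] via the ad - bc formula, or None if ad - bc = 0."""
    (a, b), (c, d) = A
    det = a * d - b * c
    if det == 0:
        return None  # the theorem says no inverse exists in this case
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

print(inverse_2x2([[1, 1], [1, 2]]))  # the matrix of Example 1
print(inverse_2x2([[1, 1], [1, 1]]))  # ad - bc = 0, so None
```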




To illustrate the use of the formula, let 
2 -l 
ae 


“i. ere i) (ee 
‘ Sere ee ae 


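For an example which underscores the necessity of the condition that $ad - bc \neq 0$,
we consider the matrix

\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\]

Observe that in this case ad - bc = 1 - 1 = 0. Now, the matrix A is invertible if
there is a $B = \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}$ such that

\[
\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

This matrix equation yields the inconsistent system

\[
\begin{aligned}
x_1 + x_3 &= 1 \\
x_2 + x_4 &= 0 \\
x_1 + x_3 &= 0 \\
x_2 + x_4 &= 1
\end{aligned}
\]

Hence, A is not invertible.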
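To find the inverse of larger square matrices, we extend the method of augmented
matrices. Let A be an n x n matrix. Let B be another n x n matrix, and let
$B_1, B_2, \ldots, B_n$ denote the n column vectors of B. Since $AB_1, AB_2, \ldots, AB_n$ are the
column vectors of AB, in order for B to be the inverse of A, we must have

\[
AB_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\qquad
AB_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}
\qquad \cdots \qquad
AB_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
\]

That is, the matrix equations

\[
A\mathbf{x} = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\qquad
A\mathbf{x} = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}
\qquad \cdots \qquad
A\mathbf{x} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
\]

must all have unique solutions. But all n linear systems can be solved simultaneously
by row-reducing the n x 2n augmented matrix

\[
\left[\begin{array}{cccc|cccc}
a_{11} & a_{12} & \cdots & a_{1n} & 1 & 0 & \cdots & 0 \\
a_{21} & a_{22} & \cdots & a_{2n} & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn} & 0 & 0 & \cdots & 1
\end{array}\right]
\]

On the left is the matrix A, and on the right is the matrix I. Then A will have an
inverse if and only if it is row equivalent to the identity matrix. In this case, each
of the linear systems can be solved. If the matrix A does not have an inverse, then
the row-reduced matrix on the left will have a row of zeros, indicating at least one of
the linear systems does not have a solution.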

Example 2 illustrates the procedure. 


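EXAMPLE 2 Find the inverse of the matrix

\[
A = \begin{bmatrix} 1 & 1 & -2 \\ -1 & 2 & 0 \\ 0 & -1 & 1 \end{bmatrix}
\]

Solution To find the inverse of this matrix, place the identity on the right to form the 3 x 6
matrix

\[
\left[\begin{array}{ccc|ccc}
1 & 1 & -2 & 1 & 0 & 0 \\
-1 & 2 & 0 & 0 & 1 & 0 \\
0 & -1 & 1 & 0 & 0 & 1
\end{array}\right]
\]

Now use row operations to reduce the matrix on the left to the identity, while
applying the same operations to the matrix on the right. The final result is

\[
\left[\begin{array}{ccc|ccc}
1 & 0 & 0 & 2 & 1 & 4 \\
0 & 1 & 0 & 1 & 1 & 2 \\
0 & 0 & 1 & 1 & 1 & 3
\end{array}\right]
\]

so the inverse matrix is

\[
A^{-1} = \begin{bmatrix} 2 & 1 & 4 \\ 1 & 1 & 2 \\ 1 & 1 & 3 \end{bmatrix}
\]

The reader should check that $AA^{-1} = A^{-1}A = I$.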


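EXAMPLE 3 Use the method of Example 2 to determine whether the matrix

\[
A = \begin{bmatrix} 1 & -1 & 2 \\ 3 & -3 & 1 \\ 3 & -3 & 1 \end{bmatrix}
\]

is invertible.


Solution Following the procedure described above, we start with the matrix

\[
\left[\begin{array}{ccc|ccc}
1 & -1 & 2 & 1 & 0 & 0 \\
3 & -3 & 1 & 0 & 1 & 0 \\
3 & -3 & 1 & 0 & 0 & 1
\end{array}\right]
\]

After the two row operations $-3R_1 + R_2 \to R_2$ followed by $-3R_1 + R_3 \to R_3$,
this matrix is reduced to

\[
\left[\begin{array}{ccc|ccc}
1 & -1 & 2 & 1 & 0 & 0 \\
0 & 0 & -5 & -3 & 1 & 0 \\
0 & 0 & -5 & -3 & 0 & 1
\end{array}\right]
\]

The row operation $-R_2 + R_3 \to R_3$ then gives

\[
\left[\begin{array}{ccc|ccc}
1 & -1 & 2 & 1 & 0 & 0 \\
0 & 0 & -5 & -3 & 1 & 0 \\
0 & 0 & 0 & 0 & -1 & 1
\end{array}\right]
\]

The 3 x 3 matrix of coefficients on the left cannot be reduced to the identity matrix,
and therefore, the original matrix does not have an inverse. Also notice that a
solution does exist to

\[
A\mathbf{x} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\]

while solutions to

\[
A\mathbf{x} = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}
\qquad \text{and} \qquad
A\mathbf{x} = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]

do not exist.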
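
The procedure of Examples 2 and 3 can be sketched as a single routine that row-reduces the augmented matrix with the identity on the right. This is an illustrative sketch, not the text's own implementation: the name `inverse` is hypothetical, exact arithmetic with `fractions.Fraction` is assumed, and `None` signals that the left block cannot be reduced to the identity.

```python
from fractions import Fraction

def inverse(A):
    """Row-reduce [A | I]; return the right block, or None if A is not invertible."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None  # the left block cannot be reduced to the identity
        aug[col], aug[pivot] = aug[pivot], aug[col]
        lead = aug[col][col]
        aug[col] = [x / lead for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

print(inverse([[1, 1, -2], [-1, 2, 0], [0, -1, 1]]))   # matrix of Example 2
print(inverse([[1, -1, 2], [3, -3, 1], [3, -3, 1]]))   # matrix of Example 3
```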


The matrix A of Example 3 has two equal rows and cannot be row-reduced to the 
identity matrix. This is true for any n x n matrix with two equal rows and provides 
an alternative method for concluding that such a matrix is not invertible. 

Theorem 9 gives a formula for the inverse of the product of invertible 
matrices. 


THEOREM 9 Let A and B be n x n invertible matrices. Then AB is invertible and

\[
(AB)^{-1} = B^{-1}A^{-1}
\]

Proof Using the properties of matrix multiplication, we have

\[
(AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I
\]

and

\[
(B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I
\]

Since, when it exists, the inverse matrix is unique, we have shown that the inverse
of AB is the matrix $B^{-1}A^{-1}$.


EXAMPLE 4 Suppose that B is an invertible matrix and A is any matrix with AB = BA. Show
that A and $B^{-1}$ commute.


Solution Since AB = BA, we can multiply both sides on the right by $B^{-1}$ to obtain

\[
(AB)B^{-1} = (BA)B^{-1}
\]

By the associative property of matrix multiplication this last equation can be written
as

\[
A(BB^{-1}) = BAB^{-1}
\]

and since $BB^{-1} = I$, we have

\[
A = BAB^{-1}
\]

Next we multiply on the left by $B^{-1}$ to obtain

\[
B^{-1}A = B^{-1}BAB^{-1} \qquad \text{so} \qquad B^{-1}A = AB^{-1}
\]

as required.


Fact Summary


Let A and B denote matrices.


1. The inverse of a matrix, when it exists, is unique.

2. If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ and $ad - bc \neq 0$, then $A^{-1} = \frac{1}{ad - bc} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$.

3. A matrix A is invertible if and only if it is row equivalent to the identity
matrix.

4. If A and B are invertible n x n matrices, then AB is invertible and
$(AB)^{-1} = B^{-1}A^{-1}$.


Exercise Set 1.4


In Exercises 1—16, a matrix A is given. Find A~! or 3 —2 4 
indicate that it does not exist. When A~! exists, check | 2 —4 | 
your answer by showing that AAT! = 1. ia 
i | b =2 | +: | 2 2 | 
“13 -1 
0 1 -i1 
3 | -3 1 | 5 ea 1 
` 1 2 1 2 -1 




17. Let 
: f 2 1 1 2 
2 —1 z = 
Ea eae 
3 0 Verify that AB + A can be factored as A(B + I) 
3 and AB + B can be factored as (A + I)B. 
-1 3 
18. If A is an n x n matrix, write A? + 2A + I in 
3 0 -3 factored form. 
1 2 0 19. Let p25 
0 -1 —1 
A= 
0o 0 -2 | -2 1l 
1 2 -3 b. Show that A™! = į (27 — A). 
0 2 -2 c. Show in general that for any square matrix A 
00 2 satisfying A? — 2A + 5I = 0, the inverse is 
AT! = IQI — A). 
1 x i ] 20. Determine those values of ^ for which the matrix 
2 -3 0 1 X 0 
1l 3 3 3 2 0 
1 2 1 
0 0 0 is not invertible. 
1 0 0 
=] =2° 6 21. Determine those values of ^ for which the matrix 
—2 0 2 1 0 
1 4 5 1 3 1 
aI Seal a 
0 0 2 is not invertible. 
aes 22. Determine those values of for which the matrix 
0 0 0 2 )\ 1 
1 0 0 3 2 
—5 0 0 1 2 
e e. . . ; 
is not invertible. 
1 0 -1 23. Let 
1 —1 0 1 A 0 
0 0 0 A=|1 1 1 
1 —1 1 0 0 1 


a. Determine those values of > for which A is 
invertible. 


b. For those values found in part (a) find the 
inverse of A. 


24. Determine those values of ~ for which the matrix 


` —l 0 
-1 ` —l 
0 -1 A 


is invertible. 


25. Find 2 x 2 matrices A and B that are not
invertible but A + B is invertible.


26. Find 2 x 2 matrices A and B that are invertible
but A + B is not invertible.


27. If A and B are n x n matrices and A is invertible,
show that

\[
(A + B)A^{-1}(A - B) = (A - B)A^{-1}(A + B)
\]


28. If $B = PAP^{-1}$, express $B^2, B^3, \ldots, B^k$, where k
is any positive integer, in terms of A, P, and
$P^{-1}$.

29. Let A and B be n x n matrices.

a. Show that if A is invertible and AB = 0, then
B = 0.

b. If A is not invertible, show there is an n x n
matrix B that is not the zero matrix and such
that AB = 0.


30. Show that if A is symmetric and invertible, then
$A^{-1}$ is symmetric.

In Exercises 31-34, the matrices A and B are
invertible symmetric matrices and AB = BA.


31. Show that AB is symmetric.

32. Show that $A^{-1}B$ is symmetric.

33. Show that $AB^{-1}$ is symmetric.

34. Show that $A^{-1}B^{-1}$ is symmetric.


35. A matrix A is orthogonal provided that
$A^t = A^{-1}$. Show that the product of two
orthogonal matrices is orthogonal.


36. Show the matrix

\[
A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\]

is orthogonal. (See Exercise 35.)


37. a. If A, B, and C are n x n invertible matrices,
show that

\[
(ABC)^{-1} = C^{-1}B^{-1}A^{-1}
\]

b. Use mathematical induction to show that for all
positive integers k, if $A_1, A_2, \ldots, A_k$ are n x n
invertible matrices, then

\[
(A_1A_2 \cdots A_k)^{-1} = A_k^{-1}A_{k-1}^{-1} \cdots A_1^{-1}
\]


38. An n x n matrix A is diagonal provided that
$a_{ij} = 0$ whenever $i \neq j$. Show that if $a_{ii} \neq 0$ for
all i, then A is invertible and the inverse is

\[
\begin{bmatrix}
\frac{1}{a_{11}} & 0 & \cdots & 0 \\
0 & \frac{1}{a_{22}} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \frac{1}{a_{nn}}
\end{bmatrix}
\]


39. Let A be an n x n invertible matrix. Show that if
A is in upper (lower) triangular form, then $A^{-1}$ is
also in upper (lower) triangular form.


40. Suppose B is row equivalent to the n x n
invertible matrix A. Show that B is invertible.


41. Show that if ad - bc = 0, then $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is
not invertible.

a. Expand the matrix equation

\[
\begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x_1 & x_2 \\ x_3 & x_4 \end{bmatrix}
= \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
\]

b. Show the 2 x 2 linear system in the variables
$x_1$ and $x_3$ that is generated in part (a) yields
d = 0. Similarly, show the system in the
variables $x_2$ and $x_4$ yields b = 0.

c. Use the results of part (b) to conclude that 
ad — bc = Q. 




1.5 Matrix Equations


In this section we show how matrix multiplication can be used to write a linear 
system in terms of matrices and vectors. We can then write a linear system as a single 
equation, using a matrix and two vectors, which generalizes the linear equation ax = b 
for real numbers. As we will see, in some cases the linear system can then be solved 
using algebraic operations similar to the operations used to solve equations involving 
real numbers. 

To illustrate the process, consider the linear system 


\[
\begin{aligned}
x - 6y - 4z &= -5 \\
2x - 10y - 9z &= -4 \\
-x + 6y + 5z &= 3
\end{aligned}
\]


The matrix of coefficients is given by 


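\[
A = \begin{bmatrix} 1 & -6 & -4 \\ 2 & -10 & -9 \\ -1 & 6 & 5 \end{bmatrix}
\]

Now let x and b be the vectors

\[
\mathbf{x} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}
\quad \text{and} \quad
\mathbf{b} = \begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix}
\]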


Then the original linear system can be rewritten as 
Ax=b 


We refer to this equation as the matrix form of the linear system and x as the vector 
form of the solution. 

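In certain cases we can find the solution of a linear system in matrix form directly
by matrix multiplication. In particular, if A is invertible, we can multiply both sides
of the previous equation on the left by $A^{-1}$, so that

\[ A^{-1}(A\mathbf{x}) = A^{-1}\mathbf{b} \]

Since matrix multiplication is associative, we have

\[ (A^{-1}A)\mathbf{x} = A^{-1}\mathbf{b} \]

therefore,

\[ \mathbf{x} = A^{-1}\mathbf{b} \]

For the example above, the inverse of the matrix

\[
A = \begin{bmatrix} 1 & -6 & -4 \\ 2 & -10 & -9 \\ -1 & 6 & 5 \end{bmatrix}
\quad \text{is} \quad
A^{-1} = \begin{bmatrix} 2 & 3 & 7 \\ -\frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ 1 & 0 & 1 \end{bmatrix}
\]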




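Therefore, the solution to the linear system in vector form is given by

\[
\mathbf{x} = A^{-1}\mathbf{b} =
\begin{bmatrix} 2 & 3 & 7 \\ -\frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} -5 \\ -4 \\ 3 \end{bmatrix}
= \begin{bmatrix} -1 \\ 2 \\ -2 \end{bmatrix}
\]

That is,

\[ x = -1 \qquad y = 2 \qquad \text{and} \qquad z = -2 \]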


We have just seen that if the matrix A has an inverse, then the equation Ax = b 
has a unique solution. This fact is recorded in Theorem 10. 


THEOREM 10 If the n x n matrix A is invertible, then for every vector b, with n components,
the linear system Ax = b has the unique solution $\mathbf{x} = A^{-1}\mathbf{b}$.
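The computation behind Theorem 10 can be checked numerically. The following is a minimal sketch, not a method taken from the text: the helper names `inverse` and `mat_vec` are illustrative assumptions, and the inverse is found by Gauss-Jordan elimination on the augmented matrix [A | I] using exact rational arithmetic from Python's standard `fractions` module.

```python
from fractions import Fraction


def inverse(matrix):
    """Invert a square matrix by Gauss-Jordan elimination on [A | I] (illustrative helper)."""
    n = len(matrix)
    # Build the augmented matrix [A | I] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("matrix is not invertible")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so that the pivot entry equals 1.
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Clear every other entry in the pivot column.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [a - factor * c for a, c in zip(aug[r], aug[col])]
    # The right half of the reduced matrix is the inverse.
    return [row[n:] for row in aug]


def mat_vec(matrix, vector):
    """Multiply a matrix by a column vector given as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Coefficient matrix and right-hand side of the system used to introduce this section.
A = [[1, -6, -4], [2, -10, -9], [-1, 6, 5]]
b = [-5, -4, 3]

A_inv = inverse(A)
x = mat_vec(A_inv, b)
print([str(v) for v in x])  # ['-1', '2', '-2']
```

For large systems one would normally solve Ax = b by elimination rather than by forming the inverse; the sketch forms the inverse only to mirror the statement of the theorem.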


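EXAMPLE 1 Write the linear system in matrix form and solve.

\[
\begin{aligned}
2x + y &= 1 \\
-4x + 3y &= 2
\end{aligned}
\]

Solution The matrix form of the linear system is given by

\[
\begin{bmatrix} 2 & 1 \\ -4 & 3 \end{bmatrix}
\begin{bmatrix} x \\ y \end{bmatrix}
= \begin{bmatrix} 1 \\ 2 \end{bmatrix}
\]

Notice that since $2(3) - (1)(-4) = 10 \neq 0$, the coefficient matrix is invertible. By
Theorem 8, of Sec. 1.4, the inverse is

\[ \frac{1}{10} \begin{bmatrix} 3 & -1 \\ 4 & 2 \end{bmatrix} \]

Now, by Theorem 10, the solution to the linear system is

\[
\mathbf{x} = \frac{1}{10} \begin{bmatrix} 3 & -1 \\ 4 & 2 \end{bmatrix}
\begin{bmatrix} 1 \\ 2 \end{bmatrix}
= \begin{bmatrix} \frac{1}{10} \\ \frac{4}{5} \end{bmatrix}
\]

so that

\[ x = \frac{1}{10} \qquad \text{and} \qquad y = \frac{4}{5} \]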

DEFINITION 1 Homogeneous Linear System A homogeneous linear system is a system of 
the form Ax = 0. 


The vector x = 0 is always a solution to the homogeneous system Ax = 0, and 
is called the trivial solution. 


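EXAMPLE 2 Let

\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 1 & 3 & 0 \\ 1 & 2 & 1 \end{bmatrix}
\quad \text{and} \quad
\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
\]

Find all vectors x such that Ax = 0.

Solution First observe that x = 0 is one solution. To find the general solution, we row-reduce
the augmented matrix

\[
\left[\begin{array}{ccc|c} 1 & 2 & 1 & 0 \\ 1 & 3 & 0 & 0 \\ 1 & 2 & 1 & 0 \end{array}\right]
\quad \text{to} \quad
\left[\begin{array}{ccc|c} 1 & 2 & 1 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 \end{array}\right]
\]

From the reduced matrix we see that $x_3$ is free with $x_2 = x_3$, and $x_1 = -2x_2 - x_3 = -3x_3$.
The solution set in vector form is given by

\[
S = \left\{ \begin{bmatrix} -3t \\ t \\ t \end{bmatrix} \;\middle|\; t \in \mathbb{R} \right\}
\]

Notice that the trivial solution is also included in S as a particular solution with
t = 0.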
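As a quick numerical check of the solution set found in Example 2, the sketch below substitutes several values of the parameter t into the vector (-3t, t, t) and multiplies by the coefficient matrix. The helper names `mat_vec` and `solution` are illustrative assumptions, not notation from the text.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix by a column vector given as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def solution(t):
    """Member of the solution set S of Example 2 for parameter value t."""
    return [-3 * t, t, t]


# Coefficient matrix of the homogeneous system in Example 2.
A = [[1, 2, 1], [1, 3, 0], [1, 2, 1]]

# Every member of S should be sent to the zero vector by A.
products = [mat_vec(A, solution(t)) for t in range(-3, 4)]
print(all(p == [0, 0, 0] for p in products))  # True
```

The value t = 0 in the loop corresponds to the trivial solution, and every other value gives a nontrivial one.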


Observe that in Example 2, the coefficient matrix is not row equivalent to I, and
hence A is not invertible.

If a homogeneous linear system Ax = 0 is such that A is invertible, then by 
Theorem 10, the only solution is x = 0. In Sec. 1.6 we will show that the converse 
is also true. 


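EXAMPLE 3 Show that if x and y are distinct solutions to the homogeneous system Ax = 0,
then x + cy is a solution for every real number c.

Solution Using the algebraic properties of matrices, we have that

\[
\begin{aligned}
A(\mathbf{x} + c\mathbf{y}) &= A(\mathbf{x}) + A(c\mathbf{y}) \\
&= A\mathbf{x} + cA\mathbf{y} \\
&= \mathbf{0} + c\mathbf{0} \\
&= \mathbf{0}
\end{aligned}
\]

Hence, x + cy is a solution to the homogeneous system.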


The result of Example 3 shows that if the homogeneous equation Ax = 0 has
two distinct solutions, then it has infinitely many solutions. That is, the homogeneous
equation Ax = 0 either has one solution (the trivial solution) or has infinitely many
solutions. The same result holds for the nonhomogeneous equation Ax = b, with
$\mathbf{b} \neq \mathbf{0}$. To see this, let u and v be distinct solutions to Ax = b and c a real number.
Then

\[
\begin{aligned}
A(\mathbf{v} + c(\mathbf{u} - \mathbf{v})) &= A\mathbf{v} + A(c(\mathbf{u} - \mathbf{v})) \\
&= A\mathbf{v} + cA\mathbf{u} - cA\mathbf{v} \\
&= \mathbf{b} + c\mathbf{b} - c\mathbf{b} = \mathbf{b}
\end{aligned}
\]


These observations are summarized in Theorem 11. 


THEOREM 11 If A is an m x n matrix, then the linear system Ax = b has no solutions, one 
solution, or infinitely many solutions. 
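The argument leading to Theorem 11 can be illustrated with a small computation. In the sketch below the matrix is the singular matrix of Example 2, while the vectors u and v and the right-hand side b are hypothetical choices made only for this illustration: u is picked arbitrarily, b is defined as Au, and v is u plus a nontrivial solution of Ax = 0.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix by a column vector given as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Singular coefficient matrix from Example 2.
A = [[1, 2, 1], [1, 3, 0], [1, 2, 1]]

# Hypothetical data: u is chosen freely and b is defined so that Au = b.
u = [1, 1, 1]
b = mat_vec(A, u)

# v differs from u by (-3, 1, 1), a nontrivial solution of Ax = 0,
# so v is a second, distinct solution of Ax = b.
v = [u[0] - 3, u[1] + 1, u[2] + 1]

# The family v + c(u - v) used in the text, for several values of c.
family = [[vi + c * (ui - vi) for ui, vi in zip(u, v)] for c in range(-5, 6)]

print(all(mat_vec(A, w) == b for w in family))  # True
```

Each value of c produces a different vector, which is why two distinct solutions force infinitely many.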


Fact Summary

Let A be an m x n matrix.

1. If m = n and A is invertible, then for every n x 1 vector b the matrix
equation Ax = b has a unique solution given by $\mathbf{x} = A^{-1}\mathbf{b}$.

2. If m = n and A is invertible, then the only solution to the homogeneous
equation Ax = 0 is the trivial solution x = 0.

3. If u and v are solutions to Ax = 0, then the vector u + cv is another
solution for every scalar c.

4. The linear system Ax = b has a unique solution, infinitely many solutions,
or no solution.


Exercise Set 1.5


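In Exercises 1-6, find a matrix A and vectors x and b
such that the linear system can be written as Ax = b.

1.
\[
\begin{aligned}
2x + 3y &= -1 \\
-x + 2y &= 4
\end{aligned}
\]

2.
\[
\begin{aligned}
-4x - y &= 3 \\
-2x - 5y &= 2
\end{aligned}
\]

3.
\[
\begin{aligned}
2x - 3y + z &= -1 \\
-x - y + 2z &= -1 \\
3x - 2y - 2z &= 3
\end{aligned}
\]

4.
\[
\begin{aligned}
3y - 2z &= 2 \\
-x + 4z &= -3 \\
-x - 3z &= 4
\end{aligned}
\]

5.
\[
\begin{aligned}
4x_1 + 3x_2 - 2x_3 - 3x_4 &= -1 \\
-3x_1 - 3x_2 + x_3 &= 4 \\
2x_1 - 3x_2 + 4x_3 - 4x_4 &= 3
\end{aligned}
\]

6.
\[
\begin{aligned}
3x_2 + x_3 - 2x_4 &= -4 \\
4x_2 - 2x_3 - 4x_4 &= 0 \\
x_1 + 3x_2 - 2x_3 &= 3
\end{aligned}
\]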




In Exercises 7—12, given the matrix A and vectors x 


and b, write the equation Ax = b as a linear system. 14. 4 3 4 
-1 
2 5 p 3 A™ = 2- 2 0 
c eel ia a 
2 
Sas | a ae 2 b=| 2 
0 3 y 1 
—2 
0 -2 0 
9 A=|2 -1 -1 15. 
3 -l1 2 —3 -2 0 3 
—1 2 -2 3 
3 pl 
= 3 "E $ 0 1 2 -3 
4 -1 | 3 1 
z —1 
-4 -5 5 | 7 ] 
10. A=| 4 -1 1 b=] = 
= g5 ; 
x —3 
x= | y | b= 2 16 
z 1 ` ae ee ae?) 
2 0 1 -l 
if 2 SS 3 A= 
eee 1 -2 4 —3 -1 -l 1 
2 -1 -2 -3 
[> 
y= X2 b= 2 1 
X3 0 —4 
b= 
X4 1 
0 -2 4 -2 
12. A=] 2 0 1 1 In Exercises 17—22, solve the linear system by finding 
1 0 1 -2 the inverse of the coefficient matrix. 
Xx] 4 
4y= 2 
a on ee 7.47 
= | xa oS ' Ia 
X4 


In Exercises 13—16, use the information given to solve 
the linear system Ax = b. 


=-1 
13. 7 
j = sis y—3z= 


2 
4 x—3y+2z= 1 
1 
1 
b= | —4 20. 
1 


— 


—2x-2y-— z= 0 
—-x- y =-1 
— y+2z= 2 


— xX, —x2-—2x34+ xy = -1 
21 2x1 + x2 + 2x3 — X4 = 1 
g 2x1 X2 2x3 2x4 = 0 
2x1 X2 X3 X4 = (0) 
—x] — 2x2 + x4=-3 
22 —xı + X2 — 2x3 + x4 = —2 
j —x1 + 2x2 — 2x3 + X4 = 3 
— 2x7 + 2x3 — 2x4 = — 1 

23. Let 


1 -1 
elie 
Use the inverse matrix to solve the linear system 
Ax = b for the given vector b. 


sas [il 


24. Let 
—1 0 —1 
A=] -3 1 -3 
1 -3 2 


Use the inverse matrix to solve the linear system 
Ax = b for the given vector b. 


—2 

a. b= 1 

1 

b. b= | —1 

0 

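25. Let

\[
A = \begin{bmatrix} -1 & -4 \\ 3 & 12 \\ 2 & 8 \end{bmatrix}
\]

Find a nontrivial solution to Ax = 0.


26. Let

\[
A = \begin{bmatrix} 1 & -2 & 4 \\ 2 & -4 & 8 \\ 3 & -6 & 12 \end{bmatrix}
\]

Find a nontrivial solution to Ax = 0.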




27. Find a nonzero 3 x 3 matrix A such that the 
vector 


is a solution to Ax = 0. 


28. Find a nonzero 3 x 3 matrix A such that the 
vector 
—1 
2 
1 


is a solution to Ax = 0. 


29. Suppose that A is an n x n matrix and u and v
are vectors in $\mathbb{R}^n$. Show that if Au = Av and
$\mathbf{u} \neq \mathbf{v}$, then A is not invertible.


30. Suppose that u is a solution to Ax = b and that v 
is a solution to Ax = 0. Show that u + v is a 
solution to Ax = b. 


31. Consider the linear system 


\[
\begin{aligned}
2x + y &= 1 \\
-x + y &= -2 \\
x + 2y &= -1
\end{aligned}
\]


a. Write the linear system in matrix form Ax = b 
and find the solution. 


b. Find a 2 x 3 matrix C such that CA = I. (The
matrix C is called a left inverse.)


c. Show that the solution to the linear system is 
given by x = Cb. 


32. Consider the linear system 


2x+ y= 3 
—x— y=-2 
3x+2y= 5 


a. Write the linear system in matrix form Ax = b 
and find the solution. 


b. Find a 2 x 3 matrix C such that CA = I. 


c. Show that the solution to the linear system is 
given by x = Cb. 




1.6 Determinants


In Sec. 1.4 we saw that the number ad - bc, associated with the 2 x 2 matrix

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

has special significance. This number is called the determinant of A and provides
useful information about the matrix. In particular, using this terminology, the matrix
A is invertible if and only if the determinant is not equal to 0. In this section the
definition of the determinant is extended to larger square matrices. The information
provided by the determinant has theoretical value and is used in some applications.
In practice, however, the computational difficulty in evaluating the determinant of a
very large matrix is significant. For this reason the information desired is generally
found by using other more efficient methods.


DEFINITION 1 Determinant of a 2 x 2 Matrix The determinant of the matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$,
denoted by |A| or det(A), is given by

\[ |A| = \det(A) = \begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \]

Using this terminology a 2 x 2 matrix is invertible if and only if its determinant
is nonzero.


| EXAMPLE1 | Find the determinant of the matrix. 


Solution 


DEFINITION 2 


a A= be 42 ae © A= ri 
—2 2 ab 2 =3 @ 


a. |A| = - i |= @O@- Wer =8 
b. |A| = i 2 [= O@- O0 = 

1 0 
e lAl=| 3 9 |=OO-O-3)=0 


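Using the determinant of a 2 x 2 matrix, we now extend this definition to 3 x 3
matrices.


DEFINITION 2 Determinant of a 3 x 3 Matrix The determinant of the matrix

\[
A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}
\]

is

\[
|A| = a_{11} \begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
- a_{12} \begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
+ a_{13} \begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\]

The computation of the 3 x 3 determinant takes the form

\[
|A| = a_{11} \begin{vmatrix} * & * & * \\ * & a_{22} & a_{23} \\ * & a_{32} & a_{33} \end{vmatrix}
- a_{12} \begin{vmatrix} * & * & * \\ a_{21} & * & a_{23} \\ a_{31} & * & a_{33} \end{vmatrix}
+ a_{13} \begin{vmatrix} * & * & * \\ a_{21} & a_{22} & * \\ a_{31} & a_{32} & * \end{vmatrix}
\]

where the first 2 x 2 determinant is obtained by deleting the first row and first column,
the second by deleting the first row and second column, and the third by deleting the
first row and third column.

Figure 1 Sign pattern for expanding a 3 x 3 determinant:

\[
\begin{bmatrix} + & - & + \\ - & + & - \\ + & - & + \end{bmatrix}
\]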


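EXAMPLE 2 Find the determinant of the matrix

\[
A = \begin{bmatrix} 2 & 1 & -1 \\ 3 & 1 & 4 \\ 5 & -3 & 3 \end{bmatrix}
\]

Solution By Definition 2, the determinant is given by

\[
\begin{aligned}
\det(A) = |A| &= 2 \begin{vmatrix} 1 & 4 \\ -3 & 3 \end{vmatrix}
- 1 \begin{vmatrix} 3 & 4 \\ 5 & 3 \end{vmatrix}
+ (-1) \begin{vmatrix} 3 & 1 \\ 5 & -3 \end{vmatrix} \\
&= (2)[3 - (-12)] - (1)(9 - 20) + (-1)(-9 - 5) \\
&= 30 + 11 + 14 \\
&= 55
\end{aligned}
\]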


In Example 2, we found the determinant of a 3 x 3 matrix by using an expansion
along the first row. With an adjustment of signs the determinant can be computed by
using an expansion along any row. The pattern for the signs is shown in Fig. 1. The
expansion along the second row is given by

\[
\begin{aligned}
\det(A) = |A| &= -3 \begin{vmatrix} 1 & -1 \\ -3 & 3 \end{vmatrix}
+ \begin{vmatrix} 2 & -1 \\ 5 & 3 \end{vmatrix}
- 4 \begin{vmatrix} 2 & 1 \\ 5 & -3 \end{vmatrix} \\
&= -3(3 - 3) + (6 + 5) - 4(-6 - 5) = 55
\end{aligned}
\]

The 2 x 2 determinants in this last equation are found from the original matrix by
deleting the second row and first column, the second row and second column, and
the second row and the third column, respectively. Expansion along the third row is
found in a similar way. In this case

\[
\begin{aligned}
\det(A) = |A| &= 5 \begin{vmatrix} 1 & -1 \\ 1 & 4 \end{vmatrix}
- (-3) \begin{vmatrix} 2 & -1 \\ 3 & 4 \end{vmatrix}
+ 3 \begin{vmatrix} 2 & 1 \\ 3 & 1 \end{vmatrix} \\
&= 5(4 + 1) + 3(8 + 3) + 3(2 - 3) = 55
\end{aligned}
\]


The determinant can also be computed using expansions along any column in a
similar manner. The method used to compute the determinant of a 3 x 3 matrix can
be extended to any square matrix.


DEFINITION 3 Minors and Cofactors of a Matrix If A is a square matrix, then the minor
$M_{ij}$, associated with the entry $a_{ij}$, is the determinant of the (n - 1) x (n - 1)
matrix obtained by deleting row i and column j from the matrix A. The cofactor
of $a_{ij}$ is $C_{ij} = (-1)^{i+j} M_{ij}$.


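For the matrix of Example 2, several minors are

\[
M_{11} = \begin{vmatrix} 1 & 4 \\ -3 & 3 \end{vmatrix} \qquad
M_{12} = \begin{vmatrix} 3 & 4 \\ 5 & 3 \end{vmatrix} \qquad \text{and} \qquad
M_{13} = \begin{vmatrix} 3 & 1 \\ 5 & -3 \end{vmatrix}
\]

Using the notation of Definition 3, the determinant of A is given by the cofactor
expansion

\[
\begin{aligned}
\det(A) &= a_{11}C_{11} + a_{12}C_{12} + a_{13}C_{13} \\
&= 2(-1)^{2}(15) + 1(-1)^{3}(-11) - 1(-1)^{4}(-14) \\
&= 30 + 11 + 14 = 55
\end{aligned}
\]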


DEFINITION 4 Determinant of a Square Matrix If A is an n x n matrix, then

\[ \det(A) = a_{11}C_{11} + a_{12}C_{12} + \cdots + a_{1n}C_{1n} = \sum_{k=1}^{n} a_{1k}C_{1k} \]

Similar to the situation for 3 x 3 matrices, the determinant of any square matrix
can be found by expanding along any row or column.


THEOREM 12 Let A be an n x n matrix. Then the determinant of A equals the cofactor expansion
along any row or any column of the matrix. That is, for every i = 1, ..., n and
j = 1, ..., n,

\[ \det(A) = a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in} = \sum_{k=1}^{n} a_{ik}C_{ik} \]

and

\[ \det(A) = a_{1j}C_{1j} + a_{2j}C_{2j} + \cdots + a_{nj}C_{nj} = \sum_{k=1}^{n} a_{kj}C_{kj} \]
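The cofactor expansion translates directly into a short recursive routine. The following is a minimal sketch under stated assumptions: the function names `minor_matrix`, `det`, `det_along_row`, and `det_along_column` are illustrative, indices are 0-based (which leaves the sign $(-1)^{i+j}$ unchanged), and the recursion is practical only for small matrices because the number of terms grows like n!.

```python
def minor_matrix(matrix, i, j):
    """Return the matrix with row i and column j deleted (0-based indices)."""
    return [row[:j] + row[j + 1:] for r, row in enumerate(matrix) if r != i]


def det(matrix):
    """Determinant by cofactor expansion along the first row."""
    n = len(matrix)
    if n == 1:
        return matrix[0][0]
    return sum((-1) ** j * matrix[0][j] * det(minor_matrix(matrix, 0, j))
               for j in range(n))


def det_along_row(matrix, i):
    """Cofactor expansion along row i."""
    return sum((-1) ** (i + j) * matrix[i][j] * det(minor_matrix(matrix, i, j))
               for j in range(len(matrix)))


def det_along_column(matrix, j):
    """Cofactor expansion along column j."""
    return sum((-1) ** (i + j) * matrix[i][j] * det(minor_matrix(matrix, i, j))
               for i in range(len(matrix)))


# Matrix of Example 2.
A = [[2, 1, -1], [3, 1, 4], [5, -3, 3]]

print(det(A))                                      # 55
print([det_along_row(A, i) for i in range(3)])     # [55, 55, 55]
print([det_along_column(A, j) for j in range(3)])  # [55, 55, 55]
```

All six expansions agree, as Theorem 12 asserts.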


For certain square matrices the computation of the determinant is simplified. One 
such class of matrices is the square triangular matrices. 


DEFINITION 5 Triangular Matrices An m x n matrix is upper triangular if $a_{ij} = 0$, for all
i > j, and is lower triangular if $a_{ij} = 0$, for all i < j. A square matrix is a
diagonal matrix if $a_{ij} = 0$, for all $i \neq j$.




Some examples of upper triangular matrices are 


l1 2 —1 0 1 1 0 
| 02 0 0 3 and 0 0 0 

0 0 2 0 0 1 

and some examples of lower triangular matrices are 

1 0 0 

1 0 ae aa 00 0 

0 1 0 and 
1 1 102 1 3 1 
0 1 2 


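THEOREM 13 If A is an n x n triangular matrix, then the determinant of A is the product of the
terms on the diagonal. That is,

\[ \det(A) = a_{11} \cdot a_{22} \cdots a_{nn} \]

Proof We present the proof for an upper triangular matrix. The proof for a
lower triangular matrix is identical. The proof is by induction on n. If n = 2, then
$\det(A) = a_{11}a_{22} - 0$ and hence is the product of the diagonal terms.

Assume that the result holds for an n x n triangular matrix. We need to show
that the same is true for an (n + 1) x (n + 1) triangular matrix A. To this end let

\[
A = \begin{bmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} & a_{1,n+1} \\
0 & a_{22} & a_{23} & \cdots & a_{2n} & a_{2,n+1} \\
0 & 0 & a_{33} & \cdots & a_{3n} & a_{3,n+1} \\
\vdots & \vdots & \vdots & & \vdots & \vdots \\
0 & 0 & 0 & \cdots & a_{nn} & a_{n,n+1} \\
0 & 0 & 0 & \cdots & 0 & a_{n+1,n+1}
\end{bmatrix}
\]

Using the cofactor expansion along row n + 1, we have

\[
\det(A) = (-1)^{(n+1)+(n+1)} a_{n+1,n+1}
\begin{vmatrix}
a_{11} & a_{12} & a_{13} & \cdots & a_{1n} \\
0 & a_{22} & a_{23} & \cdots & a_{2n} \\
0 & 0 & a_{33} & \cdots & a_{3n} \\
\vdots & \vdots & \vdots & & \vdots \\
0 & 0 & 0 & \cdots & a_{nn}
\end{vmatrix}
\]

Since the determinant on the right is n x n and upper triangular, by the inductive
hypothesis

\[
\begin{aligned}
\det(A) &= (-1)^{2n+2} (a_{n+1,n+1})(a_{11}a_{22} \cdots a_{nn}) \\
&= a_{11}a_{22} \cdots a_{nn} a_{n+1,n+1}
\end{aligned}
\]


Properties of Determinants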


Determinants for large matrices can be time-consuming to compute, so any properties
of determinants that reduce the number of computations are useful. Theorem 14 shows
how row operations affect the determinant.


THEOREM 14 Let A be a square matrix.
1. If two rows of A are interchanged to produce a matrix B, then det(B) =
-det(A).
2. If a multiple of one row of A is added to another row to produce a matrix B,
then det(B) = det(A).
3. If a row of A is multiplied by a real number $\alpha$ to produce a matrix B, then
$\det(B) = \alpha \det(A)$.
Proof (1) The proof is by induction on n. For the case n = 2 let

\[ A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \]

Then det(A) = ad - bc. If the two rows of A are interchanged to give the matrix

\[ B = \begin{bmatrix} c & d \\ a & b \end{bmatrix} \]

then det(B) = bc - ad = -det(A).

Assume that the result holds for n x n matrices and A is an (n + 1) x (n + 1)
matrix. Let B be the matrix obtained by interchanging rows i and j of A. Expanding
the determinant of A along row i and of B along row j, we have

\[ \det(A) = a_{i1}C_{i1} + a_{i2}C_{i2} + \cdots + a_{in}C_{in} \]

and

\[
\begin{aligned}
\det(B) &= a_{j1}D_{j1} + a_{j2}D_{j2} + \cdots + a_{jn}D_{jn} \\
&= a_{i1}D_{j1} + a_{i2}D_{j2} + \cdots + a_{in}D_{jn}
\end{aligned}
\]

where $C_{ij}$ and $D_{ij}$ are the cofactors of A and B, respectively. To obtain the result
there are two cases. If the signs of the cofactors $C_{ik}$ and $D_{jk}$ are the same, then
they differ by one row interchanged. If the signs of the cofactors $C_{ik}$ and $D_{jk}$ are
opposite, then they differ by two rows interchanged. In either case, by the inductive
hypothesis, we have det(B) = -det(A).


The proofs of parts 2 and 3 are left as exercises. 


We note that in Theorem 14 the same results hold for the similar column operations.
To highlight the usefulness of this theorem, recall that by Theorem 13, the
determinant of a triangular matrix is the product of the diagonal entries. So an alternative
approach to finding the determinant of a matrix A is to row-reduce A to
triangular form and apply Theorem 14 to record the effect on the determinant. This
method is illustrated in Example 3.


EXAMPLE 3 Find the determinant of the matrix


o a Tam 
2 A o 
2 we o a? 
-2 A 




Solution Since column 1 has two zeros, an expansion along this column will involve the 
fewest computations. Also by Theorem 14, if row 2 is added to row 4, then the 
determinant is unchanged and 


det(A) = 


Sow = 
(es) e dS 
O 
N 


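Expansion along the first column gives

\[
\det(A) = -2 \begin{vmatrix} 1 & 3 & -1 \\ 3 & 9 & 2 \\ 0 & -5 & -2 \end{vmatrix}
\]

We next perform the operation $-3R_1 + R_2 \to R_2$, leaving the determinant again
unchanged, so that

\[
\det(A) = -2 \begin{vmatrix} 1 & 3 & -1 \\ 0 & 0 & 5 \\ 0 & -5 & -2 \end{vmatrix}
\]

Now, interchanging the second and third rows gives

\[
\det(A) = (-2)(-1) \begin{vmatrix} 1 & 3 & -1 \\ 0 & -5 & -2 \\ 0 & 0 & 5 \end{vmatrix}
\]

This last matrix is triangular, thus by Theorem 13,

\[ \det(A) = (-2)(-1)[(1)(-5)(5)] = -50 \]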
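The strategy of Example 3, reducing to triangular form while recording the effect of each row operation, can be sketched as a general routine. This is an illustrative implementation under stated assumptions, not the text's own procedure: the name `det_by_row_reduction` is hypothetical, only row interchanges (which flip the sign) and adding a multiple of one row to another (which changes nothing) are used, and exact arithmetic comes from the standard `fractions` module.

```python
from fractions import Fraction


def det_by_row_reduction(matrix):
    """Determinant via reduction to upper triangular form (Theorems 13 and 14)."""
    n = len(matrix)
    rows = [[Fraction(v) for v in row] for row in matrix]
    sign = 1
    for col in range(n):
        # Locate a nonzero pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # no pivot in this column: the determinant is zero
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            sign = -sign  # Theorem 14, part 1: an interchange changes the sign
        for r in range(col + 1, n):
            # Theorem 14, part 2: adding a multiple of a row leaves det unchanged.
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * c for a, c in zip(rows[r], rows[col])]
    # Theorem 13: the determinant of a triangular matrix is the diagonal product.
    product = Fraction(sign)
    for i in range(n):
        product *= rows[i][i]
    return product


# The 3 x 3 determinant obtained in Example 3 after expanding along the first column.
M = [[1, 3, -1], [3, 9, 2], [0, -5, -2]]

print(det_by_row_reduction(M))       # 25
print(-2 * det_by_row_reduction(M))  # -50
```

On this matrix the routine performs the same two steps as the worked solution: it subtracts 3 times row 1 from row 2, then interchanges rows 2 and 3.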


Theorem 15 lists additional useful properties of the determinant.


THEOREM 15 Let A and B be n x n matrices and $\alpha$ a real number.
1. The determinant computation is multiplicative. That is,

\[ \det(AB) = \det(A)\det(B) \]

2. $\det(\alpha A) = \alpha^n \det(A)$
3. $\det(A^t) = \det(A)$
4. If A has a row (or column) of all zeros, then det(A) = 0.
5. If A has two equal rows (or columns), then det(A) = 0.
6. If A has a row (or column) that is a multiple of another row (or column), then
det(A) = 0.




Let A= | : E | and B = | : a | . Verify Theorem 15, part 1. 


In this case the product is

\[ AB = \begin{bmatrix} 3 & 7 \\ 1 & -11 \end{bmatrix} \]

so that det(AB) = -33 - 7 = -40. We also have det(A) det(B) = (-8)(5) = -40.
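The first three parts of Theorem 15 can also be spot-checked on a larger example. In the sketch below, A is the matrix of Example 2, while the matrix B and the scalar 3 are hypothetical values chosen only for illustration; the helpers `det`, `mat_mul`, and `transpose` are likewise illustrative names.

```python
def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def mat_mul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


A = [[2, 1, -1], [3, 1, 4], [5, -3, 3]]  # matrix of Example 2
B = [[1, 0, 2], [0, 3, 1], [4, -1, 0]]   # hypothetical second matrix
alpha, n = 3, 3                          # hypothetical scalar; n is the matrix size

scaled = [[alpha * entry for entry in row] for row in A]

print(det(mat_mul(A, B)) == det(A) * det(B))  # True (part 1)
print(det(scaled) == alpha ** n * det(A))     # True (part 2)
print(det(transpose(A)) == det(A))            # True (part 3)
```

A check on particular matrices is of course not a proof, but it is a useful guard against sign and indexing mistakes when the properties are applied by hand.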


Properties of the determinant given in Theorem 15 can be used to establish the
connection between the determinant and the invertibility of a square matrix.


THEOREM 16 A square matrix A is invertible if and only if $\det(A) \neq 0$.

Proof If the matrix A is invertible, then by Theorem 15,

\[ 1 = \det(I) = \det(AA^{-1}) = \det(A)\det(A^{-1}) \]

Since the product of two real numbers is zero if and only if at least one of them
is zero, we have $\det(A) \neq 0$ [also $\det(A^{-1}) \neq 0$].

To establish the converse, we will prove the contrapositive statement. Assume
that A is not invertible. By the remarks at the end of Sec. 1.4, the matrix A is
row equivalent to a matrix R with a row of zeros. Hence, by Theorem 14, there is
some real number $k \neq 0$ such that det(A) = k det(R), and therefore by Theorem 15,
part 4,

\[ \det(A) = k\det(R) = k(0) = 0 \]


COROLLARY 1 Let A be an invertible matrix. Then

\[ \det(A^{-1}) = \frac{1}{\det(A)} \]

Proof If A is invertible, then as in the proof of Theorem 16, $\det(A) \neq 0$,
$\det(A^{-1}) \neq 0$, and

\[ \det(A)\det(A^{-1}) = 1 \]

Therefore,

\[ \det(A^{-1}) = \frac{1}{\det(A)} \]


The final theorem of this section summarizes the connections between inverses,
determinants, and linear systems.


THEOREM 17 Let A be a square matrix. Then the following statements are equivalent.
1. The matrix A is invertible.
2. The linear system Ax = b has a unique solution for every vector b.
3. The homogeneous linear system Ax = 0 has only the trivial solution.
4. The matrix A is row equivalent to the identity matrix.
5. The determinant of the matrix A is nonzero.


Determinants can be used to find the equation of a conic section passing through
specified points. In the 17th century, Johannes Kepler’s observations of the orbits of
planets about the sun led to the conjecture that these orbits are elliptical. It was Isaac
Newton who, later in the same century, proved Kepler’s conjecture. The graph of an
equation of the form

\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 \]

is a conic section. Essentially, the graphs of conic sections are circles, ellipses,
hyperbolas, or parabolas.

The graph of the equation

\[ \frac{(x - h)^2}{a^2} + \frac{(y - k)^2}{b^2} = 1 \]

is an ellipse with center (h, k), horizontal axis of length 2a, and vertical
axis of length 2b.


EXAMPLE 5 An astronomer who wants to determine the approximate orbit of an object traveling
about the sun sets up a coordinate system in the plane of the orbit with the
sun at the origin. Five observations of the location of the object are then made
and are approximated to be (0, 0.31), (1, 1), (1.5, 1.21), (2, 1.31), and (2.5, 1).
Use these measurements to find the equation of the ellipse that approximates the
orbit.


Solution We need to find the equation of an ellipse in the form

\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 \]

Each data point must satisfy this equation; for example, since the point (2, 1.31) is
on the graph of the conic section,

\[ A(2)^2 + B(2)(1.31) + C(1.31)^2 + D(2) + E(1.31) + F = 0 \]

so

\[ 4A + 2.62B + 1.7161C + 2D + 1.31E + F = 0 \]

Substituting the five points in the general equation, we obtain the 5 x 6 linear
system (with coefficients rounded to two decimal places)

\[
\begin{aligned}
0.1C + 0.31E + F &= 0 \\
A + B + C + D + E + F &= 0 \\
4A + 2.62B + 1.72C + 2D + 1.31E + F &= 0 \\
2.25A + 1.82B + 1.46C + 1.5D + 1.21E + F &= 0 \\
6.25A + 2.5B + C + 2.5D + E + F &= 0
\end{aligned}
\]


Since the equation $Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0$ describing the
ellipse passing through the five given points has infinitely many solutions, by
Theorem 17, we have

\[
\begin{vmatrix}
x^2 & xy & y^2 & x & y & 1 \\
0 & 0 & 0.1 & 0 & 0.31 & 1 \\
1 & 1 & 1 & 1 & 1 & 1 \\
4 & 2.62 & 1.72 & 2 & 1.31 & 1 \\
2.25 & 1.82 & 1.46 & 1.5 & 1.21 & 1 \\
6.25 & 2.5 & 1 & 2.5 & 1 & 1
\end{vmatrix} = 0
\]

Expanding the determinant gives us the equation

\[ -0.014868x^2 + 0.0348xy - 0.039y^2 + 0.017238x - 0.003y + 0.00483 = 0 \]

The graph of the orbit is shown in Fig. 2.

Figure 2
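The determinant condition of Example 5 can be carried out mechanically: expanding along the first row, the coefficient of each of $x^2, xy, y^2, x, y, 1$ is the corresponding cofactor, a signed 5 x 5 determinant built from the data rows. The sketch below is one way to do this and rests on stated assumptions: the names `det`, `monomial_row`, and `conic_value` are illustrative, and the observations are entered exactly rather than rounded to two decimal places as in the text, so the coefficients it produces need not match the printed equation digit for digit (they are in any case determined only up to a common nonzero factor).

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry != 0:
            sub = [row[:j] + row[j + 1:] for row in m[1:]]
            total += (-1) ** j * entry * det(sub)
    return total


def monomial_row(x, y):
    """Row [x^2, xy, y^2, x, y, 1] for a point (x, y), as exact fractions."""
    x, y = Fraction(x), Fraction(y)
    return [x * x, x * y, y * y, x, y, Fraction(1)]


# The five observations of Example 5, entered as decimal strings for exactness.
observations = [("0", "0.31"), ("1", "1"), ("1.5", "1.21"), ("2", "1.31"), ("2.5", "1")]
data_rows = [monomial_row(x, y) for x, y in observations]

# Cofactors of the symbolic first row: coefficients A, B, C, D, E, F of the conic.
coefficients = [
    (-1) ** j * det([row[:j] + row[j + 1:] for row in data_rows])
    for j in range(6)
]


def conic_value(x, y):
    """Evaluate Ax^2 + Bxy + Cy^2 + Dx + Ey + F at the point (x, y)."""
    return sum(c * m for c, m in zip(coefficients, monomial_row(x, y)))


print([float(c) for c in coefficients])
print(all(conic_value(x, y) == 0 for x, y in observations))  # True
```

Each observation satisfies the resulting equation exactly, because substituting it into the first row produces a determinant with two equal rows, which is zero by Theorem 15, part 5.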


Cramer’s Rule 


Determinants can also be used to solve linear systems. To illustrate the technique
consider the 2 x 2 linear system

\[
\begin{aligned}
ax + by &= u \\
cx + dy &= v
\end{aligned}
\]

with $ad - bc \neq 0$. By Theorem 17, the linear system has a unique solution.
To eliminate the variable y, we multiply the first equation by d and the second
equation by b, and then we subtract the two equations. This gives

\[ adx + bdy - (bcx + bdy) = du - bv \]

Simplifying, we have

\[ (ad - bc)x = du - bv \qquad \text{so that} \qquad x = \frac{du - bv}{ad - bc} \]

Using a similar procedure, we can solve for y.

\[ y = \frac{av - cu}{ad - bc} \]

Using determinants, we can write the solution as

\[
x = \frac{\begin{vmatrix} u & b \\ v & d \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}}
\qquad \text{and} \qquad
y = \frac{\begin{vmatrix} a & u \\ c & v \end{vmatrix}}{\begin{vmatrix} a & b \\ c & d \end{vmatrix}}
\]

Notice that the solutions for x and y are similar. The denominator for each is the
determinant of the coefficient matrix. The determinant in the numerator for x is formed
by replacing the first column of the coefficient matrix with the column of constants
on the right-hand side of the linear system. The determinant in the numerator for y is
formed by replacing the second column of the coefficient matrix with the column of
constants. This method of solving a linear system is called Cramer’s rule.


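EXAMPLE 6 Use Cramer’s rule to solve the linear system.

\[
\begin{aligned}
2x + 3y &= 2 \\
-5x + 7y &= 3
\end{aligned}
\]

Solution The determinant of the coefficient matrix is given by

\[ \begin{vmatrix} 2 & 3 \\ -5 & 7 \end{vmatrix} = 14 - (-15) = 29 \]

and since the determinant is not zero, the system has a unique solution. The solution
is given by

\[
x = \frac{\begin{vmatrix} 2 & 3 \\ 3 & 7 \end{vmatrix}}{29} = \frac{14 - 9}{29} = \frac{5}{29}
\qquad \text{and} \qquad
y = \frac{\begin{vmatrix} 2 & 2 \\ -5 & 3 \end{vmatrix}}{29} = \frac{6 - (-10)}{29} = \frac{16}{29}
\]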


THEOREM 18 Cramer’s Rule Let A be an n x n invertible matrix, and let b be a column vector
with n components. Let $A_i$ be the matrix obtained by replacing the ith column of A
with b. If $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$ is the unique solution to the linear system Ax = b, then

\[ x_i = \frac{\det(A_i)}{\det(A)} \qquad i = 1, 2, \ldots, n \]

Proof Let $I_i$ be the matrix obtained by replacing the ith column of the identity
matrix with x. Then the linear system is equivalent to the matrix equation

\[ AI_i = A_i \qquad \text{so} \qquad \det(AI_i) = \det(A_i) \]

By Theorem 15, part 1, we have

\[ \det(A)\det(I_i) = \det(AI_i) = \det(A_i) \]

Since A is invertible, $\det(A) \neq 0$ and hence

\[ \det(I_i) = \frac{\det(A_i)}{\det(A)} \]

Expanding along the ith row to find the determinant of $I_i$ gives

\[ \det(I_i) = x_i \det(I) = x_i \]

where I is the (n - 1) x (n - 1) identity. Therefore,

\[ x_i = \frac{\det(A_i)}{\det(A)} \]


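If a unique solution exists, then Cramer’s rule can be used to solve larger square
linear systems. Example 7 illustrates the technique for a 3 x 3 system.


EXAMPLE 7 Solve the linear system.

\[
\begin{aligned}
2x + 3y - z &= 2 \\
3x - 2y + z &= -1 \\
-5x - 4y + 2z &= 3
\end{aligned}
\]

Solution The determinant of the coefficient matrix is given by

\[ \begin{vmatrix} 2 & 3 & -1 \\ 3 & -2 & 1 \\ -5 & -4 & 2 \end{vmatrix} = -11 \]

By Cramer’s rule the solution to the system is

\[
\begin{aligned}
x &= -\frac{1}{11} \begin{vmatrix} 2 & 3 & -1 \\ -1 & -2 & 1 \\ 3 & -4 & 2 \end{vmatrix} = -\frac{5}{11} \\
y &= -\frac{1}{11} \begin{vmatrix} 2 & 2 & -1 \\ 3 & -1 & 1 \\ -5 & 3 & 2 \end{vmatrix} = \frac{36}{11} \\
z &= -\frac{1}{11} \begin{vmatrix} 2 & 3 & 2 \\ 3 & -2 & -1 \\ -5 & -4 & 3 \end{vmatrix} = \frac{76}{11}
\end{aligned}
\]

The reader should verify this solution by substitution into the original system.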
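Cramer’s rule as stated in Theorem 18 is short enough to write out directly. The following is a minimal sketch, assuming small integer systems: the function names `det` and `cramer` are illustrative, determinants are computed by cofactor expansion, and results are returned as exact fractions.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cramer(A, b):
    """Solve Ax = b by Cramer's rule; requires det(A) != 0."""
    d = det(A)
    if d == 0:
        raise ValueError("det(A) = 0, so Cramer's rule does not apply")
    solution = []
    for i in range(len(A)):
        # A_i: replace the ith column of A with the vector b.
        A_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(A_i), d))
    return solution


# Example 6 and Example 7 from the text.
print([str(v) for v in cramer([[2, 3], [-5, 7]], [2, 3])])
# ['5/29', '16/29']
print([str(v) for v in cramer([[2, 3, -1], [3, -2, 1], [-5, -4, 2]], [2, -1, 3])])
# ['-5/11', '36/11', '76/11']
```

Because each unknown costs one n x n determinant, the rule is mainly of theoretical interest for large n; elimination is the more efficient method in practice.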


Fact Summary

Let A and B be n x n matrices.

1. $\det \begin{bmatrix} a & b \\ c & d \end{bmatrix} = ad - bc$

2. The determinant of A can be computed by expanding along any row or
column provided that the signs are adjusted using the pattern

\[
\begin{bmatrix}
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
\vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}
\]

3. The matrix A is invertible if and only if $\det(A) \neq 0$.

4. If A is a triangular matrix, then the determinant of A is the product of the
diagonal terms.

5. If two rows of A are interchanged, the determinant of the resulting matrix
is the negative of the determinant of A.

6. If a multiple of one row of A is added to another row, the determinant is
not altered.

7. If one row of A is multiplied by a scalar c, the determinant of the resulting
matrix is c times the determinant of A.

8. $\det(AB) = \det(A)\det(B)$, $\det(cA) = c^n \det(A)$, $\det(A^t) = \det(A)$

9. If A has a row or column of zeros, then det(A) = 0.

10. If one row or column of A is a multiple of another row or column, then
det(A) = 0.

11. If A is invertible, then $\det(A^{-1}) = \frac{1}{\det(A)}$.


Exercise Set 1.6


In Exercises 1—4, evaluate the determinant of the 1 0 0 
matrix by inspection. 7136 0 
2 —40 10 0 8 -l 
1. | 0 3 12 eee | 
o & 4 8 | 7 2 1 
1 2 3 3 6 6 
2 4 5 6 | 9. Answer the questions using the matrix 
1 2 3 
2 0 1 
1 0 0 0 A= 3 -1 4 
3 3 -1 0 0 —4 i 3-2 
"“|4 #2 2 0 
1 1 6 5 a. Find the determinant of the matrix by using an 
expansion along row 1. 
4 r = 2 b. Find the determinant of the matrix by using an 
: 1 E 1 expansion along row 2. 
c. Find the determinant of the matrix by using an 
In Exercises 5—8, use determinants to decide if the expansion along column 2. 
matrix is invertible. d. Interchange rows | and 3 of the matrix, and 
5 2 -1 find the determinant of the transformed matrix. 
—2 2 e. Multiply row 1 of the matrix found in part (d) 


by —2, and find the determinant of the new 
6. | l 3 | matrix. Use the value to find the determinant of 
JE the original matrix. 




f. Replace row 3 of the matrix found in part (e) 
with the sum of row 3 and —2 times row 1 and 
find the determinant of the new matrix in two 
ways. First, use an expansion along row 3 of 
the new matrix. Second, use the value for the 
determinant of A that has already been 
calculated. 

g. Does the matrix A have an inverse? Do not try 
to compute the inverse. 


10. Answer the questions using the matrix

\[
A = \begin{bmatrix} -1 & 1 & 1 & 2 \\ 3 & -2 & 0 & -1 \\ 0 & 1 & 0 & 1 \\ 3 & 3 & 3 & 3 \end{bmatrix}
\]

a. Find the determinant of the matrix by using an 
expansion along row 4. 

b. Find the determinant of the matrix by using an 
expansion along row 3. 

c. Find the determinant of the matrix by using an 
expansion along column 2. 

d. In (a), (b), and (c), which computation do you 
prefer, and why? 

e. Does the matrix A have an inverse? Do not try 
to compute the inverse. 


In Exercises 11—26, find the determinant of the 
matrix. Specify whether the matrix has an inverse 
without trying to compute the inverse. 


EE 
1 | 
1| 
w[i] 
| 
| 


15. 


5 -5 —4 
17. -1 -3 5 
—3 1 3 
3 -—3 5 
18. 2 4 -3 
—3 -1 -5 
—3 4 5 
19. 1 1 4 
-1 -3 4 
—2 —2 -4 
20. 1 1 3 
—4 0 4 
1 —4 1 
21. 1 —2 4 
0 2 3 
12 4 
22. 4 0 0 
12 4 
2 —-2 -2 -2 
2 2 3 0 
a —2 -2 2 0 
1 -1 -3 -l1 
1 -l 0 0 
3 3 1 1 
24. -1 -1 -3 2 
-1 -2 2 1 
—1 1 1 0 
0 0 -l 0 
25. (0) 0 1 -l 
(0) 1 1 (0) 
1 —1 1 1 
1 0 -l 0 
-1 -l 0 0 
26. 1 0 0 0 
0 1 1 1 
—1 1 1 —1 
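In Exercises 27-30, let

\[
A = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}
\]

and assume det(A) = 10.

27. Find det(3A).

28. Find $\det(2A^{-1})$.

29. Find $\det[(2A)^{-1}]$.

30. Find

\[
\det \begin{bmatrix} a & g & d \\ b & h & e \\ c & i & f \end{bmatrix}
\]

31. Find x, assuming

\[
\det \begin{bmatrix} x^2 & x & 2 \\ 2 & 1 & 1 \\ 0 & 0 & -5 \end{bmatrix} = 0
\]

32. Find the determinant of the matrix

\[
\begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 \\
1 & 0 & 1 & 1 & 1 \\
1 & 1 & 0 & 1 & 1 \\
1 & 1 & 1 & 0 & 1
\end{bmatrix}
\]

33. Suppose $a_1 \neq b_1$. Describe the set of all points
(x, y) that satisfy the equation

\[
\det \begin{bmatrix} 1 & 1 & 1 \\ x & a_1 & b_1 \\ y & a_2 & b_2 \end{bmatrix} = 0
\]

34. Use the three systems to answer the questions.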


x+ y=3 


x+ y=3 
Oe 


Oi 


a. Form the coefficient matrices A, B, and C, 
respectively, for the three systems. 


b. Find det(A), det(B), and det(C). How are they 
related? 


c. Which of the coefficient matrices have 
inverses? 


d. Find all solutions to system (1). 
e. Find all solutions to system (2). 
f. Find all solutions to system (3). 


35. Answer the questions about the linear system.

\[
\begin{aligned}
x - y - 2z &= 3 \\
-x + 2y + 3z &= 1 \\
2x - 2y - 2z &= -2
\end{aligned}
\]

a. Form the coefficient matrix A for the linear
system.
b. Find det(A).
c. Does the system have a unique solution?
Explain.
d. Find all solutions to the system.

36. Answer the questions about the linear system.

\[
\begin{aligned}
x + 3y - 2z &= -1 \\
2x + 5y + z &= 2 \\
2x + 6y - 4z &= -2
\end{aligned}
\]

a. Form the coefficient matrix A for the linear
system.
b. Find det(A).
c. Does the system have a unique solution?
Explain.
d. Find all solutions to the system.

37. Answer the questions about the linear system.
x z=-1 
2x +2z= 1 
x-—3y-3z= 1 


a. Form the coefficient matrix A for the linear 
system. 

b. Find det(A). 

c. Does the system have a unique solution? 
Explain. 

d. Find all solutions for the system. 


In Exercises 38-43, use the fact that the graph of the
general equation

\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 \]

is essentially a parabola, circle, ellipse, or hyperbola.


38. a. Find the equation of the parabola in the form

\[ Ax^2 + Dx + Ey + F = 0 \]

that passes through the points (0, 3), (1, 1), and (4, -2).
b. Sketch the graph of the parabola.

39. a. Find the equation of the parabola in the form

\[ Cy^2 + Dx + Ey + F = 0 \]

that passes through the points (-2, -2), (3, 2), and (4, -3).
b. Sketch the graph of the parabola.

40. a. Find the equation of the circle in the form

\[ A(x^2 + y^2) + Dx + Ey + F = 0 \]

that passes through the points (-3, -3), (-1, 2), and (3, 0).
b. Sketch the graph of the circle.

41. a. Find the equation of the hyperbola in the form

\[ Ax^2 + Cy^2 + Dx + Ey + F = 0 \]

that passes through the points (0, -4), (0, 4), (1, -2), and (2, 3).
b. Sketch the graph of the hyperbola.

42. a. Find the equation of the ellipse in the form

\[ Ax^2 + Cy^2 + Dx + Ey + F = 0 \]

that passes through the points (-3, 2), (-1, 3), (1, -1), and (4, 2).
b. Sketch the graph of the ellipse.

43. a. Find the equation of the ellipse in the form

\[ Ax^2 + Bxy + Cy^2 + Dx + Ey + F = 0 \]

that passes through the points (-1, 0), (0, 1), (1, 0), (2, 2), and (3, 1).
b. Sketch the graph of the ellipse.

In Exercises 44-51, use Cramer’s rule to solve the
linear system.

44. { ARNA
aaa

45. 5x -Sy=7
ee al

46. pe :

47.
\[
\begin{aligned}
-9x - 4y &= 3 \\
-7x + 5y &= -10
\end{aligned}
\]

48.
\[
\begin{aligned}
-10x - 7y &= -12 \\
12x + 11y &= 5
\end{aligned}
\]

49.
\[
\begin{aligned}
-x - 3y &= 4 \\
-8x + 4y &= 3
\end{aligned}
\]

50.
—2x+ y—4z=—8
fe t= 3
— z=-8

51.
\[
\begin{aligned}
2x + 3y + 2z &= -2 \\
x - 3y - 8z &= -2 \\
-3x + 2y - 7z &= 2
\end{aligned}
\]

52. An n x n matrix is skew-symmetric provided
$A^t = -A$. Show that if A is skew-symmetric and
n is an odd positive integer, then A is not
invertible.

53. If A is a 3 x 3 matrix, show that $\det(A) = \det(A^t)$.

54. If A is an n x n upper triangular matrix, show
that $\det(A) = \det(A^t)$.


1.7 Elementary Matrices and LU Factorization


In Sec. 1.2 we saw how the linear system Ax = b can be solved by using Gaussian 
elimination on the corresponding augmented matrix. Recall that the idea there was 
to use row operations to transform the coefficient matrix to row echelon form. The 
upper triangular form of the resulting matrix made it easy to find the solution by using 
back substitution. (See Example 1 of Sec. 1.2.) In a similar manner, if an augmented 
matrix is reduced to lower triangular form, then forward substitution can be used to 
find the solution of the corresponding linear system. For example, starting from the
first equation of the linear system

\[
\begin{aligned}
x_1 &= 3 \\
-x_1 + x_2 &= -1 \\
2x_1 - x_2 + x_3 &= 5
\end{aligned}
\]

we obtain the solution $x_1 = 3$, $x_2 = 2$, and $x_3 = 1$. Thus, from a computational
perspective, to find the solution of a linear system, it is desirable that the corresponding
matrix be either upper or lower triangular.
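The forward substitution just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `forward_substitution` and the list-of-rows matrix layout are choices made here, not notation from the text, and the diagonal entries are assumed to be nonzero.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve L y = b for a lower triangular L with nonzero diagonal entries."""
    n = len(b)
    y = [Fraction(0)] * n
    for i in range(n):
        # Subtract the contributions of the unknowns found so far.
        known = sum(Fraction(L[i][j]) * y[j] for j in range(i))
        y[i] = (Fraction(b[i]) - known) / Fraction(L[i][i])
    return y

# Coefficient matrix and right-hand side of the lower triangular system above.
L = [[1, 0, 0],
     [-1, 1, 0],
     [2, -1, 1]]
b = [3, -1, 5]

x = forward_substitution(L, b)
print([int(v) for v in x])  # [3, 2, 1]
```

Exact fractions are used so that the result can be compared directly with the hand computation.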

In this section we show how, in certain cases, an $m \times n$ matrix $A$ can be written
as $A = LU$, where $L$ is a lower triangular matrix and $U$ is an upper triangular matrix.
We call this an LU factorization of $A$. For example, an LU factorization of the
matrix $\begin{bmatrix} -3 & -2 \\ 3 & 4 \end{bmatrix}$ is given by

\[
\begin{bmatrix} -3 & -2 \\ 3 & 4 \end{bmatrix}
=
\begin{bmatrix} -1 & 0 \\ 1 & 2 \end{bmatrix}
\begin{bmatrix} 3 & 2 \\ 0 & 1 \end{bmatrix}
\]

with $L = \begin{bmatrix} -1 & 0 \\ 1 & 2 \end{bmatrix}$ and $U = \begin{bmatrix} 3 & 2 \\ 0 & 1 \end{bmatrix}$. We also show in this section that when such
a factorization of $A$ exists, a process that involves both forward and back substitution
can be used to find the solution to the linear system $Ax = b$.


Elementary Matrices 


As a first step we describe an alternative method for carrying out row operations using 
elementary matrices. 


DEFINITION 1

Elementary Matrix An elementary matrix is any matrix that can be obtained
from the identity matrix by performing a single elementary row operation.


As an illustration, the elementary matrix $E_1$ is formed by interchanging the first
and third rows of the $3 \times 3$ identity matrix $I$, that is,

\[
E_1 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\]

Corresponding to the three row operations given in Theorem 2 of Sec. 1.2, there
are three types of elementary matrices. For example, as we have just seen, $E_1$ is
derived from $I$ by means of the row operation $R_1 \leftrightarrow R_3$, which interchanges the first
and third rows. Also, the row operation $kR_1 + R_2 \to R_2$ applied to $I$ yields the
elementary matrix

\[
E_2 = \begin{bmatrix} 1 & 0 & 0 \\ k & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Next, if $c \neq 0$, the row operation $cR_2 \to R_2$ performed on $I$ produces the matrix

\[
E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & c & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Using any row operation, we can construct larger elementary matrices from larger
identity matrices in a similar manner.

We now show how elementary matrices can be used to perform row operations.
To illustrate the process, let $A$ be the $3 \times 3$ matrix given by

\[
A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}
\]

Multiplying $A$ by the matrix $E_1$, defined above, we obtain

\[
E_1 A = \begin{bmatrix} 7 & 8 & 9 \\ 4 & 5 & 6 \\ 1 & 2 & 3 \end{bmatrix}
\]

Observe that $E_1 A$ is the result of interchanging the first and third rows of $A$. Theorem 19
gives the situation in general.


THEOREM 19 Let $A$ be an $m \times n$ matrix and $E$ the elementary matrix obtained from the $m \times m$
identity matrix $I$ by a single row operation $\mathcal{R}$. Denote by $\mathcal{R}(A)$ the result of
performing the row operation on $A$. Then $\mathcal{R}(A) = EA$.
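Theorem 19 can be checked numerically on the matrix $A$ used above. The following is a minimal sketch, not part of the text: the helper names (`identity`, `matmul`, `swap_rows`, `add_multiple`) are hypothetical, rows are indexed from 0, and the multiplier 5 is an arbitrary choice for illustration.

```python
def identity(n):
    """Return the n x n identity matrix as a list of rows."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def swap_rows(M, i, j):
    """Row operation R_i <-> R_j, applied to a copy of M."""
    M = [row[:] for row in M]
    M[i], M[j] = M[j], M[i]
    return M

def add_multiple(M, k, src, dst):
    """Row operation k*R_src + R_dst -> R_dst, applied to a copy of M."""
    M = [row[:] for row in M]
    M[dst] = [k * a + b for a, b in zip(M[src], M[dst])]
    return M

A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

E1 = swap_rows(identity(3), 0, 2)        # interchange rows 1 and 3 of I
E2 = add_multiple(identity(3), 5, 0, 1)  # 5*R_1 + R_2 -> R_2 applied to I

# Multiplying by E gives the same result as applying the row operation to A.
assert matmul(E1, A) == swap_rows(A, 0, 2)
assert matmul(E2, A) == add_multiple(A, 5, 0, 1)
print(matmul(E1, A))  # [[7, 8, 9], [4, 5, 6], [1, 2, 3]]
```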


By repeated application of Theorem 19, a sequence of row operations can be
performed on a matrix $A$ by successively multiplying $A$ by the corresponding
elementary matrices. Specifically, let $E_i$ be the elementary matrix corresponding to the
row operation $\mathcal{R}_i$ with $1 \le i \le k$. Then

\[
\mathcal{R}_k(\cdots \mathcal{R}_2(\mathcal{R}_1(A)) \cdots) = E_k \cdots E_2 E_1 A
\]


EXAMPLE 1 Let $A$ be the matrix

\[
A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 5 & 0 \\ -1 & 1 & 1 \end{bmatrix}
\]

Use elementary matrices to perform the row operations $\mathcal{R}_1\colon R_2 - 3R_1 \to R_2$,
$\mathcal{R}_2\colon R_3 + R_1 \to R_3$, and $\mathcal{R}_3\colon R_3 + 3R_2 \to R_3$.

Solution The elementary matrices corresponding to these row operations are given by

\[
E_1 = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\qquad
E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 3 & 1 \end{bmatrix}
\]

respectively, so that

\[
E_3 E_2 E_1 A = \begin{bmatrix} 1 & 2 & -1 \\ 0 & -1 & 3 \\ 0 & 0 & 9 \end{bmatrix}
\]

The reader should check that the matrix on the right-hand side is equal to the result
of performing the row operations above on $A$ in the order given.


The Inverse of an Elementary Matrix


An important property of elementary matrices is that they are invertible.


THEOREM 20 Let $E$ be an $n \times n$ elementary matrix. Then $E$ is invertible. Moreover, its inverse
is also an elementary matrix.


Proof Let $E$ be an elementary matrix. To show that $E$ is invertible, we compute
its determinant and apply Theorem 17 of Sec. 1.6. There are three cases depending
on the form of $E$. First, if $E$ is derived from $I$ by an interchange of two rows,
then $\det(E) = -\det(I) = -1$. Second, if $E$ is the result of multiplying one row
of $I$ by a nonzero scalar $c$, then $\det(E) = c \det(I) = c \neq 0$. Third, if $E$ is formed
by adding a multiple of one row of $I$ to another row, then $\det(E) = \det(I) = 1$.
In each case, $\det(E) \neq 0$ and hence $E$ is invertible. To show that $E^{-1}$ is an
elementary matrix, we use the algorithm of Sec. 1.4 to compute the inverse. In this
case starting with the $n \times 2n$ augmented matrix

\[
[\, E \mid I \,]
\]

we reduce the elementary matrix on the left (to $I$) by applying the reverse operation
used to form $E$, obtaining

\[
[\, I \mid E^{-1} \,]
\]

That $E^{-1}$ is also an elementary matrix follows from the fact that the reverse of
each row operation is also a row operation.
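The three cases of the proof can be illustrated concretely. The sketch below builds one elementary matrix of each type together with the elementary matrix of the reverse operation and checks that their product is the identity. The constructor names (`swap`, `scale`, `replace`), the 0-based row indices, and the particular constants are assumptions made for this illustration only.

```python
from fractions import Fraction

def identity(n):
    """n x n identity matrix with exact rational entries."""
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def swap(n, i, j):
    """Elementary matrix for R_i <-> R_j."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scale(n, i, c):
    """Elementary matrix for c*R_i -> R_i, with c nonzero."""
    E = identity(n)
    E[i][i] = Fraction(c)
    return E

def replace(n, k, src, dst):
    """Elementary matrix for k*R_src + R_dst -> R_dst, with src != dst."""
    E = identity(n)
    E[dst][src] = Fraction(k)
    return E

# Each pair is (E, elementary matrix of the reverse row operation).
pairs = [
    (swap(3, 0, 2), swap(3, 0, 2)),                # an interchange undoes itself
    (scale(3, 1, 5), scale(3, 1, Fraction(1, 5))), # multiply by 5, then by 1/5
    (replace(3, 4, 0, 2), replace(3, -4, 0, 2)),   # add 4*R_1, then subtract it
]

for E, E_inv in pairs:
    assert matmul(E_inv, E) == identity(3)
    assert matmul(E, E_inv) == identity(3)
```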


As an illustration of Theorem 20, let $\mathcal{R}$ be the row operation $2R_2 + R_1 \to R_1$,
which says to add 2 times row 2 to row 1. The corresponding elementary matrix is
given by

\[
E = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Since $\det(E) = 1$, then $E$ is invertible with

\[
E^{-1} = \begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

Observe that $E^{-1}$ corresponds to the row operation $\mathcal{R}_2\colon -2R_2 + R_1 \to R_1$, which
says to subtract 2 times row 2 from row 1, reversing the original row operation $\mathcal{R}$.

Recall from Sec. 1.2 that an $m \times n$ matrix $A$ is row equivalent to an $m \times n$ matrix
$B$ if $B$ can be obtained from $A$ by a finite sequence of row operations. Theorem 21
gives a restatement of this fact in terms of elementary matrices.


THEOREM 21 Let $A$ and $B$ be $m \times n$ matrices. The matrix $A$ is row equivalent to $B$ if and only if
there are elementary matrices $E_1, E_2, \ldots, E_k$ such that $B = E_k E_{k-1} \cdots E_2 E_1 A$.


In light of Theorem 21, if $A$ is row equivalent to $B$, then $B$ is row equivalent to
$A$. Indeed, if $A$ is row equivalent to $B$, then

\[
B = E_k E_{k-1} \cdots E_2 E_1 A
\]

for some elementary matrices $E_1, E_2, \ldots, E_k$. Successively multiplying both sides of
this equation by $E_k^{-1}, E_{k-1}^{-1}, \ldots,$ and $E_1^{-1}$, we obtain

\[
A = E_1^{-1} \cdots E_{k-1}^{-1} E_k^{-1} B
\]

Since each of the matrices $E_1^{-1}, E_2^{-1}, \ldots, E_k^{-1}$ is an elementary matrix, $B$ is row
equivalent to $A$.

Theorem 22 uses elementary matrices to provide a characterization of invertible
matrices.


THEOREM 22 An $n \times n$ matrix $A$ is invertible if and only if it can be written as the product of
elementary matrices.


Proof First assume that there are elementary matrices $E_1, E_2, \ldots, E_k$ such that

\[
A = E_1 E_2 \cdots E_{k-1} E_k
\]

We claim that the matrix $B = E_k^{-1} \cdots E_2^{-1} E_1^{-1}$ is the inverse of $A$. To show this,
we multiply both sides of $A = E_1 E_2 \cdots E_{k-1} E_k$ by $B$ to obtain

\[
BA = (E_k^{-1} \cdots E_2^{-1} E_1^{-1}) A = (E_k^{-1} \cdots E_2^{-1} E_1^{-1})(E_1 E_2 \cdots E_{k-1} E_k) = I
\]

establishing the claim. On the other hand, suppose that $A$ is invertible. In Sec. 1.4,
we showed that $A$ is row equivalent to the identity matrix. So by Theorem 21,
there are elementary matrices $E_1, E_2, \ldots, E_k$ such that $I = E_k E_{k-1} \cdots E_2 E_1 A$.
Consequently, $A = E_1^{-1} \cdots E_k^{-1} I$. Since $E_1^{-1}, \ldots, E_k^{-1}$, and $I$ are all elementary
matrices, $A$ is the product of elementary matrices as desired.


LU Factorization 


There are many reasons why it is desirable to obtain an LU factorization of a matrix.
For example, suppose that $A$ is an $m \times n$ matrix and $b_i$, with $1 \le i \le k$, is a collection
of vectors in $\mathbb{R}^m$, which represent outputs for the linear systems $Ax = b_i$. Finding input
vectors $x_i$ requires that we solve $k$ linear systems. However, since the matrix $A$ is the
same for each linear system, the process is greatly simplified if $A$ is replaced with its
LU factorization. The details for solving a linear system using an LU factorization
are presented later in this section. If $A$ is an $n \times n$ matrix with an LU factorization
given by $A = LU$, then $L$ and $U$ are also $n \times n$. See Fig. 1. Then by Theorem 13 of
Sec. 1.6, the determinant of $A$ is given by

\[
\det(A) = (\ell_{11} \cdots \ell_{nn})(u_{11} \cdots u_{nn})
\]

where $\ell_{ii}$ and $u_{ii}$ are the diagonal entries of $L$ and $U$, respectively. If this determinant
is not zero, then by Theorem 9 of Sec. 1.4 the inverse of the matrix $A$ is given by

\[
A^{-1} = (LU)^{-1} = U^{-1} L^{-1}
\]

Figure 1

To describe the process of obtaining an LU factorization of an $m \times n$ matrix $A$,
suppose that $A$ can be reduced to an upper triangular matrix by a sequence of row
operations which correspond to lower triangular elementary matrices. That is, there
exist lower triangular elementary matrices $L_1, L_2, \ldots, L_k$ such that

\[
L_k L_{k-1} \cdots L_1 A = U
\]

Since each of the matrices $L_i$ with $1 \le i \le k$ is invertible, we have

\[
A = L_1^{-1} L_2^{-1} \cdots L_k^{-1} U
\]

By Theorem 20, $L_1^{-1}, L_2^{-1}, \ldots,$ and $L_k^{-1}$ are elementary matrices. They are also lower
triangular. Now let $L = L_1^{-1} L_2^{-1} \cdots L_k^{-1}$. Observe that $L$ is lower triangular as it is
the product of lower triangular matrices. The desired factorization is thus given by
$A = LU$.


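EXAMPLE 2 Find an LU factorization of the matrix

\[
A = \begin{bmatrix} 3 & 6 & -3 \\ 6 & 15 & -5 \\ -1 & -2 & 6 \end{bmatrix}
\]

Solution Observe that $A$ can be row-reduced to an upper triangular matrix by means of
the row operations $\mathcal{R}_1\colon \tfrac{1}{3}R_1 \to R_1$, $\mathcal{R}_2\colon -6R_1 + R_2 \to R_2$, and
$\mathcal{R}_3\colon R_1 + R_3 \to R_3$. The corresponding elementary matrices are therefore given by

\[
E_1 = \begin{bmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
E_2 = \begin{bmatrix} 1 & 0 & 0 \\ -6 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\]

respectively, so that

\[
E_3 E_2 E_1 A =
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ -6 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} \tfrac{1}{3} & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 3 & 6 & -3 \\ 6 & 15 & -5 \\ -1 & -2 & 6 \end{bmatrix}
=
\begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{bmatrix}
= U
\]

Since each elementary matrix is invertible, solving for $A$ gives

\[
A = E_1^{-1} E_2^{-1} E_3^{-1} U =
\begin{bmatrix} 3 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 6 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{bmatrix}
=
\begin{bmatrix} 3 & 0 & 0 \\ 6 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{bmatrix}
= LU
\]

From the remarks preceding Example 2, we see that to use the above procedure
with success, there are limitations on the matrix $A$. Specifically, $A$ must be reducible to
upper triangular form without any row interchanges. This will ensure that the elementary
matrices used in the elimination process will all be lower triangular. Theorem 23
summarizes these results.


THEOREM 23 Let $A$ be an $m \times n$ matrix that can be reduced to the upper triangular matrix $U$
without row interchanges by means of the $m \times m$ lower triangular matrices $L_1$,
$L_2, \ldots, L_k$. If $L = L_1^{-1} L_2^{-1} \cdots L_k^{-1}$, then $A$ has the LU factorization $A = LU$.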
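The construction in Theorem 23 can be sketched in code. The version below is an illustration under stated assumptions, not the text's procedure verbatim: it uses only replacement operations (no scaling), so the resulting $L$ has 1s on its diagonal and therefore differs from the $L$ found in Example 2, which also scaled the first row. Both products equal $A$; an LU factorization is not unique. The function name `lu_no_pivot` is hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lu_no_pivot(A):
    """Return (L, U) with A = L U, using replacement operations only.

    Raises ValueError when a row interchange would be required.
    """
    m, n = len(A), len(A[0])
    U = [[Fraction(v) for v in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    for col in range(min(m, n)):
        if U[col][col] == 0:
            if any(U[r][col] != 0 for r in range(col + 1, m)):
                raise ValueError("zero pivot: a row interchange is needed")
            continue  # column is already zero below the diagonal
        for r in range(col + 1, m):
            factor = U[r][col] / U[col][col]
            # The operation -factor*R_col + R_r -> R_r has inverse
            # +factor*R_col + R_r -> R_r, which is what L records.
            L[r][col] = factor
            for c in range(col, n):
                U[r][c] -= factor * U[col][c]
    return L, U

A = [[3, 6, -3], [6, 15, -5], [-1, -2, 6]]  # matrix of Example 2
L, U = lu_no_pivot(A)
assert matmul(L, U) == A
```

Calling `lu_no_pivot([[0, 1], [1, 0]])` raises `ValueError`, since that matrix cannot be reduced without a row interchange.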


A simple example of a matrix that cannot be reduced to upper triangular form
without interchanges is given by $P = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$. This matrix does not have an LU
factorization. (See Exercise 29.)


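As stated in Theorem 23, it is not necessary for $A$ to be square, as shown in
Example 3.


EXAMPLE 3 Find an LU factorization of the matrix

\[
A = \begin{bmatrix} 1 & -3 & -2 & 0 \\ 1 & -2 & 1 & -1 \\ 2 & -4 & 3 & 2 \end{bmatrix}
\]

Solution Observe that $A$ can be reduced to the upper triangular matrix

\[
U = \begin{bmatrix} 1 & -3 & -2 & 0 \\ 0 & 1 & 3 & -1 \\ 0 & 0 & 1 & 4 \end{bmatrix}
\]

by means of the elementary matrices

\[
E_1 = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad
E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix}
\qquad
E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -2 & 1 \end{bmatrix}
\]

Hence,

\[
A = E_1^{-1} E_2^{-1} E_3^{-1} U =
\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 2 & 2 & 1 \end{bmatrix}
\begin{bmatrix} 1 & -3 & -2 & 0 \\ 0 & 1 & 3 & -1 \\ 0 & 0 & 1 & 4 \end{bmatrix}
= LU
\]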


Solving a Linear System Using LU Factorization 


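We now turn our attention to the process of solving a linear system by using an
LU factorization. To illustrate the procedure, consider the linear system $Ax = b$ with
$b = \begin{bmatrix} 3 \\ 11 \\ 9 \end{bmatrix}$ and $A$ the matrix of Example 2. By using the LU factorization of $A$
found in Example 2,

\[
Ax = \begin{bmatrix} 3 & 6 & -3 \\ 6 & 15 & -5 \\ -1 & -2 & 6 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 3 \\ 11 \\ 9 \end{bmatrix}
\]

can be written equivalently as

\[
LUx = \begin{bmatrix} 3 & 0 & 0 \\ 6 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 3 \\ 11 \\ 9 \end{bmatrix}
\]

To solve this equation efficiently, we define the vector $y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$ by the equation
$Ux = y$, so that

\[
\begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
\]

Making this substitution in the linear system $L(Ux) = b$ gives

\[
\begin{bmatrix} 3 & 0 & 0 \\ 6 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix}
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}
= \begin{bmatrix} 3 \\ 11 \\ 9 \end{bmatrix}
\]

Using forward substitution, we solve the system $Ly = b$ for $y$, obtaining $y_1 = 1$,
$y_2 = 5$, and $y_3 = 10$. Next we solve the linear system $Ux = y$. That is,

\[
\begin{bmatrix} 1 & 2 & -1 \\ 0 & 3 & 1 \\ 0 & 0 & 5 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 1 \\ 5 \\ 10 \end{bmatrix}
\]

Using back substitution, we obtain $x_3 = 2$, $x_2 = 1$, and $x_1 = 1$.
The following steps summarize the procedure for solving the linear system
$Ax = b$ when $A$ admits an LU factorization.

1. Use Theorem 23 to write the linear system $Ax = b$ as $L(Ux) = b$.
2. Define the vector $y$ by means of the equation $Ux = y$.
3. Use forward substitution to solve the system $Ly = b$ for $y$.
4. Use back substitution to solve the system $Ux = y$ for $x$. Note that $x$ is the solution
to the original linear system.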
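The four steps can be traced in code on the system just solved. This is a minimal sketch assuming $L$ and $U$ are already known (here, the factors from Example 2) and have nonzero diagonal entries; the function names are illustrative choices, not notation from the text.

```python
from fractions import Fraction

def forward_substitution(L, b):
    """Solve L y = b for lower triangular L (step 3)."""
    n = len(b)
    y = [Fraction(0)] * n
    for i in range(n):
        known = sum(Fraction(L[i][j]) * y[j] for j in range(i))
        y[i] = (Fraction(b[i]) - known) / Fraction(L[i][i])
    return y

def back_substitution(U, y):
    """Solve U x = y for square upper triangular U (step 4)."""
    n = len(y)
    x = [Fraction(0)] * n
    for i in range(n - 1, -1, -1):
        known = sum(Fraction(U[i][j]) * x[j] for j in range(i + 1, n))
        x[i] = (Fraction(y[i]) - known) / Fraction(U[i][i])
    return x

# Factors of A from Example 2 and the right-hand side b used above.
L = [[3, 0, 0], [6, 1, 0], [-1, 0, 1]]
U = [[1, 2, -1], [0, 3, 1], [0, 0, 5]]
b = [3, 11, 9]

y = forward_substitution(L, b)  # y = (1, 5, 10)
x = back_substitution(U, y)     # x = (1, 1, 2)
print([int(v) for v in y], [int(v) for v in x])
```

Once $L$ and $U$ are stored, each additional right-hand side $b_i$ costs only one forward and one back substitution, which is the saving described at the start of this subsection.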


PLU Factorization 


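We have seen that a matrix $A$ has an LU factorization provided that it can be
row-reduced without interchanging rows. We conclude this section by noting that when
row interchanges are required to reduce $A$, a factorization is still possible. In this
case the matrix $A$ can be factored as $A = PLU$, where $P$ is a permutation matrix,
that is, a matrix that results from interchanging rows of the identity matrix. As an
illustration, let

\[
A = \begin{bmatrix} 0 & 2 & -2 \\ 1 & 4 & 3 \\ 1 & 2 & 0 \end{bmatrix}
\]

The matrix $A$ can be reduced to

\[
U = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 2 & 3 \\ 0 & 0 & -5 \end{bmatrix}
\]

by means of the row operations $\mathcal{R}_1\colon R_1 \leftrightarrow R_3$, $\mathcal{R}_2\colon -R_1 + R_2 \to R_2$, and
$\mathcal{R}_3\colon -R_2 + R_3 \to R_3$. The corresponding elementary matrices are given by

\[
E_1 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\qquad
E_2 = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\qquad \text{and} \qquad
E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix}
\]

Observe that the elementary matrix $E_1$ is a permutation matrix while $E_2$ and $E_3$ are
lower triangular. Hence,

\[
A = E_1^{-1}\left(E_2^{-1} E_3^{-1}\right) U
= \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}
\begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\begin{bmatrix} 1 & 2 & 0 \\ 0 & 2 & 3 \\ 0 & 0 & -5 \end{bmatrix}
= PLU
\]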
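The factorization above is easy to verify by multiplying the three factors. The short check below is only a sketch; `matmul` is a helper defined here for the purpose.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

P = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]   # permutation matrix E_1^{-1} = E_1
L = [[1, 0, 0], [1, 1, 0], [0, 1, 1]]   # E_2^{-1} E_3^{-1}
U = [[1, 2, 0], [0, 2, 3], [0, 0, -5]]
A = [[0, 2, -2], [1, 4, 3], [1, 2, 0]]

LU = matmul(L, U)     # this is A with rows 1 and 3 interchanged
PLU = matmul(P, LU)   # P restores the original row order
assert PLU == A
```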


1. A row operation on a matrix $A$ can be performed by multiplying $A$ by an
elementary matrix.

2. An elementary matrix is invertible, and the inverse is an elementary matrix.

3. An $n \times n$ matrix $A$ is invertible if and only if it is the product of
elementary matrices.

4. An $m \times n$ matrix $A$ has an LU factorization if it can be reduced to an
upper triangular matrix with no row interchanges.

5. If $A = LU$, then $L$ is invertible.

6. An LU factorization of $A$ provides an efficient method for solving $Ax = b$.


Exercise Set 1.7 js 


In Exercises 1-4: b. Write A as the product of elementary matrices. 


a. Find the 3 x 3 elementary matrix E that performs 5. A= 1 3 
the row operation. i 


b. Compute EA, where 


—2 5 
122 1 eas | 2 5 | 
A=|3 1 2 
1 1 —4 1 2 -l 
7A=|]2 5 3 
1. 2R, + Ro — Ro 1 2 0 
2 ReR 
Peres -1 1 1 
3. —3R2 + Rz — R3 8. A= 3 1 0 
—2 1 1 
4. —R, + R — R3 
In Exercises 5-10: 0O 1 1l 
; : F 9 A=/1 2 3 
a. Find the elementary matrices required to 010 


reduce A to the identity. 




0001 
0010 
Ag. ay 00 
100 0 


In Exercises 11—16, find the LU factorization of the 
matrix A. 


1 -2 
11. A= | 7 
12. A = | -F 
z l 
1 2 1 
13. A= 2 5 5 
—3 -6 -2 
1 1 1 
14. A= | —1 0 -4 
22 2 3 
1 
1 at eS 
15. A= 1 3 j 
-1 -l 4 


1 3 ] 
= =F 57 
POG Hyde spe «38 
3 -6 3 10 
In Exercises 17—22, solve the linear system by using 
LU factorization. 


x+4y-—3z= 0 
—x — 3y + 5z = —3 
2x+8y—5z= 1 


x—2y+ z=-1 
20. 2x —3y+6z= 8 


—2x+4y- z= 4 


x—2y+3z+ w= 5 
x= y+5z+3w= 6 
2x —4y+7z+3w= 14 
x+ y—5z-2w=-8 


21. 


x+2y+2z- w= 5 
y+ z- w=-2 
ae —x—2y— z+4w= 1 
2x+2y+2z+2w= 1 
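In Exercises 23 and 24, find the PLU factorization of
the matrix $A$.

23. $A = \begin{bmatrix} 0 & 1 & -1 \\ 2 & -1 & 0 \\ 1 & -3 & 2 \end{bmatrix}$

24. $A = \begin{bmatrix} 0 & 0 & 1 \\ 2 & 1 & 1 \\ 1 & 0 & -3 \end{bmatrix}$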


In Exercises 25—28, find the inverse of the matrix A 
by using an LU factorization. 


1 4 

25. A= | 23 1l 

1 7 
re ed 

21 -1 
27. A=|2 2 -2 

2 2 1 

—3 2 1 
28. A= 3 -1 1 

—3 1 0 

29. Show directly that the matrix $A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ does
not have an LU factorization.


30. Let A, B, and C be m x n matrices. Show that if 
A is row equivalent to B and B is row equivalent 
to C, then A is row equivalent to C. 


31. Show that if A and B are n x n invertible 
matrices, then A and B are row equivalent. 


32. Suppose that A is an n x n matrix with an LU 

factorization, A = LU. 

a. What can be said about the diagonal entries 
of L? 

b. Express det(A) in terms of the entries of L 
and U. 

c. Show that A can be row-reduced to U using 
only replacement operations. 


1.8 Applications of Systems of Linear Equations


In the opening to this chapter we introduced linear systems by describing their con- 
nection to the process of photosynthesis. In this section we enlarge the scope of the 
applications we consider and show how linear systems are used to model a wide 
variety of problems. 


Balancing Chemical Equations 


Recall from the introduction to this chapter that a chemical equation is balanced if 
there are the same number of atoms, of each element, on both sides of the equation. 
Finding the number of molecules needed to balance a chemical equation involves 
solving a linear system. 


EXAMPLE 1 Propane is a common gas used for cooking and home heating. Each molecule of
propane is composed of 3 atoms of carbon and 8 atoms of hydrogen, written as
$\mathrm{C_3H_8}$. When propane burns, it combines with oxygen gas, $\mathrm{O_2}$, to form carbon
dioxide, $\mathrm{CO_2}$, and water, $\mathrm{H_2O}$. Balance the chemical equation

\[
\mathrm{C_3H_8 + O_2 \longrightarrow CO_2 + H_2O}
\]

that describes this process.

Solution We need to find whole numbers $x_1$, $x_2$, $x_3$, and $x_4$, so that the equation

\[
x_1\mathrm{C_3H_8} + x_2\mathrm{O_2} \longrightarrow x_3\mathrm{CO_2} + x_4\mathrm{H_2O}
\]

is balanced. Equating the number of carbon, hydrogen, and oxygen atoms on both
sides of this equation yields the linear system

\[
\begin{aligned}
3x_1 - x_3 &= 0 \\
8x_1 - 2x_4 &= 0 \\
2x_2 - 2x_3 - x_4 &= 0
\end{aligned}
\]

Solving this system, we obtain the solution set

\[
S = \left\{ \left( \tfrac{1}{4}t, \tfrac{5}{4}t, \tfrac{3}{4}t, t \right) \,\middle|\, t \in \mathbb{R} \right\}
\]

Since whole numbers are required to balance the chemical equation, particular solutions
are obtained by letting $t = 0, 4, 8, \ldots$. For example, if $t = 8$, then
$x_1 = 2$, $x_2 = 10$, $x_3 = 6$, and $x_4 = 8$. The corresponding balanced equation is given by

\[
\mathrm{2C_3H_8 + 10O_2 \longrightarrow 6CO_2 + 8H_2O}
\]
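The choice of $t$ can be automated. The sketch below starts from the solution set $\left(\tfrac14 t, \tfrac54 t, \tfrac34 t, t\right)$ found above, clears denominators to get the smallest positive whole-number solution, and checks it against the three atom-balance equations. The variable names are illustrative assumptions.

```python
from fractions import Fraction
from math import gcd

# Direction vector of the solution set: (x1, x2, x3, x4) = t * direction.
direction = [Fraction(1, 4), Fraction(5, 4), Fraction(3, 4), Fraction(1)]

# Smallest positive t that makes every entry a whole number:
# the least common multiple of the denominators.
t = 1
for f in direction:
    t = t * f.denominator // gcd(t, f.denominator)

coeffs = [int(f * t) for f in direction]

# Rows: carbon, hydrogen, oxygen balance (reactants minus products).
system = [
    [3, 0, -1, 0],
    [8, 0, 0, -2],
    [0, 2, -2, -1],
]
assert all(sum(a * x for a, x in zip(row, coeffs)) == 0 for row in system)
print(t, coeffs)  # 4 [1, 5, 3, 4]
```

Taking $t = 4$ gives the coefficients 1, 5, 3, 4; doubling them reproduces the $t = 8$ solution shown in the example.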


Network Flow 


To study the flow of traffic through city streets, urban planners use mathematical
models called directed graphs or digraphs. In these models, edges and points are
used to represent streets and intersections, respectively. Arrows are used to indicate
the direction of traffic. To balance a traffic network, we assume that the outflow of
each intersection is equal to the inflow, and that the total flow into the network is
equal to the total flow out.


EXAMPLE 2 Partial traffic flow information, given by average hourly volume, is known about
a network of five streets, as shown in Fig. 1. Complete the flow pattern for the
network.

Figure 1

Solution To complete the traffic model, we need to find values for the eight unknown flows,
as shown in Fig. 2.

Figure 2

Our assumptions about the intersections give us the set of linear equations

\[
\begin{aligned}
x_2 + x_6 &= 300 + x_1 \\
100 + 500 &= x_6 + 300 \\
200 + x_3 &= x_2 + x_7 \\
300 + x_7 &= 400 + x_4 \\
400 + 500 &= x_3 + x_8 \\
x_4 + 600 &= 400 + x_5
\end{aligned}
\]

Moreover, balancing the total flow into the network with the total flow out gives
us the additional equation

\[
500 + 600 + 500 + 200 + 100 = 400 + x_5 + x_8 + 300 + x_1
\]

The final linear system is

\[
\begin{aligned}
-x_1 + x_2 + x_6 &= 300 \\
x_6 &= 300 \\
x_2 - x_3 + x_7 &= 200 \\
-x_4 + x_7 &= 100 \\
x_3 + x_8 &= 900 \\
-x_4 + x_5 &= 200 \\
x_1 + x_5 + x_8 &= 1200
\end{aligned}
\]

The solution is given by

\[
\begin{aligned}
x_1 &= 1100 - s - t & x_2 &= 1100 - s - t & x_3 &= 900 - t & x_4 &= -100 + s \\
x_5 &= 100 + s & x_6 &= 300 & x_7 &= s & x_8 &= t
\end{aligned}
\]

Notice that $x_7$ and $x_8$ are free variables. However, to obtain particular solutions,
we must choose numbers for $s$ and $t$ that produce positive values for each $x_i$ in the
system (otherwise we will have traffic going in the wrong direction!). For example,
$s = 400$ and $t = 300$ give a viable solution.
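A candidate choice of $s$ and $t$ can be checked mechanically against the final linear system and the positivity requirement. The following is a small sketch; the function names `flows` and `is_balanced` are hypothetical helpers introduced here.

```python
def flows(s, t):
    """General solution of the traffic system, keyed by the index i of x_i."""
    return {1: 1100 - s - t, 2: 1100 - s - t, 3: 900 - t, 4: -100 + s,
            5: 100 + s, 6: 300, 7: s, 8: t}

def is_balanced(x):
    """Check the seven equations of the final linear system."""
    return all([
        -x[1] + x[2] + x[6] == 300,
        x[6] == 300,
        x[2] - x[3] + x[7] == 200,
        -x[4] + x[7] == 100,
        x[3] + x[8] == 900,
        -x[4] + x[5] == 200,
        x[1] + x[5] + x[8] == 1200,
    ])

x = flows(400, 300)
assert is_balanced(x)
assert all(v > 0 for v in x.values())   # every street flows the right way

# The equations hold for any s and t, but positivity can fail:
bad = flows(50, 300)
assert is_balanced(bad) and bad[4] < 0  # x_4 = -50 is not a valid flow
```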


Nutrition 


Designing a healthy diet involves selecting foods from different groups that, when 
combined in the proper amounts, satisfy certain nutritional requirements. 


EXAMPLE 3 Table 1 gives the amount, in milligrams (mg), of vitamin A, vitamin C, and calcium
contained in 1 gram (g) of four different foods. For example, food 1 has 10 mg of
vitamin A, 50 mg of vitamin C, and 60 mg of calcium per gram of food. Suppose
that a dietician wants to prepare a meal that provides 200 mg of vitamin A, 250
mg of vitamin C, and 300 mg of calcium. How much of each food should be used?


Table 1

          | Food 1 | Food 2 | Food 3 | Food 4
Vitamin A |   10   |   30   |   20   |   10
Vitamin C |   50   |   30   |   25   |   10
Calcium   |   60   |   20   |   40   |   25


Solution Let $x_1$, $x_2$, $x_3$, and $x_4$ denote the amounts of foods 1 through 4, respectively. The
amounts for each of the foods needed to satisfy the dietician's requirement can be
found by solving the linear system

\[
\begin{aligned}
10x_1 + 30x_2 + 20x_3 + 10x_4 &= 200 \\
50x_1 + 30x_2 + 25x_3 + 10x_4 &= 250 \\
60x_1 + 20x_2 + 40x_3 + 25x_4 &= 300
\end{aligned}
\]

Rounded to two decimal places, the solution to the linear system is given by

\[
\begin{aligned}
x_1 &= 0.63 + 0.11t & x_2 &= 3.13 + 0.24t \\
x_3 &= 5 - 0.92t & x_4 &= t
\end{aligned}
\]

Observe that each of these values must be nonnegative. Hence, particular solutions
can be found by choosing nonnegative values of $t$ such that

\[
0 \le 5 - 0.92t
\]

Isolating $t$ gives

\[
t \le \frac{5}{0.92} \approx 5.4
\]
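The rounded solution can be cross-checked with exact arithmetic. The fractions used below ($\tfrac58 + \tfrac{11}{96}t$, $\tfrac{25}{8} + \tfrac{23}{96}t$, $5 - \tfrac{11}{12}t$) are not stated in the text; they are an assumption obtained by solving the system by hand, and they round to the decimals given above. The function name `diet` is likewise hypothetical.

```python
from fractions import Fraction

# Coefficient matrix (mg per gram) and required totals (mg).
A = [[10, 30, 20, 10],
     [50, 30, 25, 10],
     [60, 20, 40, 25]]
b = [200, 250, 300]

def diet(t):
    """Exact amounts (x1, x2, x3, x4) for the free parameter x4 = t."""
    t = Fraction(t)
    return [Fraction(5, 8) + Fraction(11, 96) * t,
            Fraction(25, 8) + Fraction(23, 96) * t,
            5 - Fraction(11, 12) * t,
            t]

for t in (0, 2, 5):
    x = diet(t)
    # Every nutrient requirement is met exactly ...
    assert [sum(a * v for a, v in zip(row, x)) for row in A] == b
    # ... and every amount is nonnegative for these values of t.
    assert all(v >= 0 for v in x)

print([round(float(v), 2) for v in diet(2)])
```

With the exact coefficient $\tfrac{11}{12}$ the bound on $t$ is $\tfrac{60}{11} \approx 5.45$; the value 5.4 above comes from the rounded coefficient 0.92.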


Economic Input-Output Models 


Constructing models of the economy is another application of linear systems. In a real 
economy there are tens of thousands of goods and services. By focusing on specific 
sectors of the economy the Leontief input-output model gives a method for describing 
a simplified, but useful model of a real economy. For example, consider an economy 
for which the outputs are services, raw materials, and manufactured goods. Table 2 
provides the inputs needed per unit of output. 


Table 2

              | Services | Raw materials | Manufacturing
Services      |   0.04   |     0.05      |     0.02
Raw materials |   0.03   |     0.04      |     0.04
Manufacturing |   0.02   |     0.3       |     0.2


Here to provide $1.00 worth of service, the service sector requires $0.04 worth of
services, $0.05 worth of raw materials, and $0.02 worth of manufactured goods. The
data in Table 2 are recorded in the matrix

\[
A = \begin{bmatrix} 0.04 & 0.05 & 0.02 \\ 0.03 & 0.04 & 0.04 \\ 0.02 & 0.3 & 0.2 \end{bmatrix}
\]

This matrix is called the input-output matrix. The demand vector D gives the total
demand on the three sectors, in billions of dollars, and the production vector x, also
in billions of dollars, contains the production level information for each sector. Each
component of Ax represents the level of production that is used by the corresponding
sector and is called the internal demand.

As an example, suppose that the production vector is

\[
x = \begin{bmatrix} 200 \\ 100 \\ 150 \end{bmatrix}
\]

Then the internal demand is given by

\[
Ax = \begin{bmatrix} 0.04 & 0.05 & 0.02 \\ 0.03 & 0.04 & 0.04 \\ 0.02 & 0.3 & 0.2 \end{bmatrix}
\begin{bmatrix} 200 \\ 100 \\ 150 \end{bmatrix}
= \begin{bmatrix} 16 \\ 16 \\ 64 \end{bmatrix}
\]

This result means that the service sector requires $16 billion of services, raw materials,
and manufactured goods. It also means that the external demand cannot exceed
$184 billion of services, $84 billion of raw materials, and $86 billion in manufactured
goods.

Alternatively, suppose that the external demand D is given. We wish to find a
level of production for each sector such that the internal and external demands are
met. Thus, to balance the economy, x must satisfy

\[
x - Ax = D
\]

that is,

\[
(I - A)x = D
\]

When I - A is invertible, then

\[
x = (I - A)^{-1} D
\]

EXAMPLE 4 Suppose that the external demand for services, raw materials, and manufactured
goods in the economy described in Table 2 is given by

\[
D = \begin{bmatrix} 300 \\ 500 \\ 600 \end{bmatrix}
\]

Find the levels of production that balance the economy.


Solution From the discussion above we have that the production vector x must satisfy

\[
(I - A)x = D
\]

that is,

\[
\begin{bmatrix} 0.96 & -0.05 & -0.02 \\ -0.03 & 0.96 & -0.04 \\ -0.02 & -0.3 & 0.8 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
= \begin{bmatrix} 300 \\ 500 \\ 600 \end{bmatrix}
\]

Since the matrix on the left is invertible, the production vector x can be found by
multiplying both sides by the inverse. Thus,

\[
\begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}
\approx
\begin{bmatrix} 1.04 & 0.06 & 0.03 \\ 0.03 & 1.06 & 0.05 \\ 0.04 & 0.4 & 1.27 \end{bmatrix}
\begin{bmatrix} 300 \\ 500 \\ 600 \end{bmatrix}
\approx
\begin{bmatrix} 360 \\ 569 \\ 974 \end{bmatrix}
\]

So the service sector must produce approximately $360 billion worth of services,
the raw material sector must produce approximately $569 billion worth of raw
materials, and the manufacturing sector must produce approximately $974 billion
worth of manufactured goods.
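The same production levels can be found by solving (I - A)x = D directly rather than forming the inverse. The sketch below is an illustration only: it uses Gauss-Jordan elimination with exact fractions, and the function name `solve` is an assumption made here. Because the inverse displayed above is rounded to two decimals, the directly computed values are expected to be close to, but not identical with, the figures 360, 569, and 974.

```python
from fractions import Fraction

def solve(M, d):
    """Solve M x = d by Gauss-Jordan elimination (M square and invertible)."""
    n = len(d)
    aug = [[Fraction(v) for v in row] + [Fraction(d[i])]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row, then clear the column in all other rows.
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]

# Input-output matrix of Table 2, entered as decimal strings to stay exact.
A = [[Fraction(v) for v in row] for row in
     [["0.04", "0.05", "0.02"],
      ["0.03", "0.04", "0.04"],
      ["0.02", "0.3", "0.2"]]]
I_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(3)]
             for i in range(3)]
D = [300, 500, 600]

x = solve(I_minus_A, D)
print([round(float(v), 1) for v in x])
```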


Exercise Set 1.8 


In Exercises 1—4, use the smallest possible positive 
integers to balance the chemical equation. 
1. When subjected to heat, aluminium reacts with 
copper oxide to produce copper metal and 
aluminium oxide according to the equation 


Al; + CuO —> Al,O3 + Cu 


Balance the chemical equation. 

2. When sodium thiosulfate solution is mixed with
brown iodine solution, the mixture becomes
colorless as the iodine is converted to colorless
sodium iodide according to the equation

\[
\mathrm{I_2 + Na_2S_2O_3 \longrightarrow NaI + Na_2S_4O_6}
\]

Balance the chemical equation.

3. Cold remedies such as Alka-Seltzer use the
reaction of sodium bicarbonate with citric acid in
solution to produce a fizz (carbon dioxide gas).
The reaction produces sodium citrate, water, and
carbon dioxide according to the equation

\[
\mathrm{NaHCO_3 + C_6H_8O_7 \longrightarrow Na_3C_6H_5O_7 + H_2O + CO_2}
\]

Balance the chemical equation. For every 100 mg
of sodium bicarbonate, how much citric acid
should be used? What mass of carbon dioxide will
be produced?

4. Balance the chemical equation

\[
\mathrm{MnS + As_2Cr_{10}O_{35} + H_2SO_4 \longrightarrow HMnO_4 + AsH_3 + CrS_3O_{12} + H_2O}
\]

5. Find the traffic flow pattern for the network in the
figure. Flow rates are in cars per hour. Give one
specific solution.



300 8. Find the traffic flow pattern for the network in the 
figure. Flow rates are in cars per half-hour. What 
is the smallest possible value for xg? 


800 500 
150 100 
300 200 
300 
700 
6. Find the traffic flow pattern for the network in the 
figure. Flow rates are in cars per hour. Give one 100 200 
specific solution. 
100 ; iis 
9. The table lists the number of milligrams of 
vitamin A, vitamin B, vitamin C, and niacin 
iod s00 contained in 1 g of four different foods. A 


dietician wants to prepare a meal that provides 
250 mg of vitamin A, 300 mg of vitamin B, 400 
500 300 mg of vitamin C, and 70 mg of niacin. Determine 
Pa how many grams of each food must be included, 
ae and describe any limitations on the quantities of 
each food that can be used. 


200 


500 
Group 1 | Group 2 

7. Find the traffic flow pattern for the network in the Vitamin A 20 at 

figure. Flow rates are in cars per half-hour. What Vitamin B 40 20 

is the current status of the road labeled x5? Vitamin C 50 40 

150 

Niacin 5 5 

X1 X4 
10. The table lists the amounts of sodium, potassium, 
100 50 carbohydrates, and fiber in a single serving of 
three food groups. Also listed are the daily 
x2 X3 recommended amounts based on a 2000-calorie 


diet. Is it possible to prepare a diet using the three 
food groups alone that meets the recommended 


amounts? 
100 




Gami | eama | eap aaeei a. Write the input-output matrix A for the 


cee 

b. IF the levels of production, in billion, of the 
three sectors of the economy are 300, 150, and 
200, respectively, find the internal demand 


vector for the economy. What is the total 
external demand that can be met by the three 
sectors? 


11. An economy is divided into three sectors as 
described in the table. Each entry represents the 
number of units required by the sector to produce 
1 unit of output. 


c. Find the inverse of the matrix J — A. 


d. If the external demands on the three sectors 
Services | Raw materials | Manufacturing are 350, 400, and 600, respectively, determine 


economy. 


12. Economies are, in general, very complicated with many sectors. The input-output
matrix $A$ is based on grouping the different industries and services into 10 separate
sectors. If the external demands to the sectors are given in the vector $D$, determine
the levels of production that balance the economy.

\[
A = \begin{bmatrix}
0.041 & 0.032 & 0.018 & 0.041 & 0.009 & 0.002 & 0.039 & 0.048 & 0.04 & 0.021 \\
0.023 & 0.037 & 0.046 & 0.011 & 0.004 & 0.024 & 0.041 & 0.006 & 0.004 & 0.007 \\
0.018 & 0.03 & 0.039 & 0.05 & 0.038 & 0.011 & 0.049 & 0.001 & 0.028 & 0.047 \\
0.034 & 0.005 & 0.034 & 0.039 & 0.023 & 0.007 & 0.009 & 0.023 & 0.05 & 0.006 \\
0.022 & 0.019 & 0.021 & 0.009 & 0.007 & 0.035 & 0.044 & 0.023 & 0.019 & 0.019 \\
0.044 & 0.005 & 0.02 & 0.006 & 0.013 & 0.005 & 0.032 & 0.016 & 0.047 & 0.02 \\
0.018 & 0.001 & 0.049 & 0.011 & 0.043 & 0.003 & 0.024 & 0.047 & 0.027 & 0.042 \\
0.026 & 0.004 & 0.03 & 0.015 & 0.044 & 0.021 & 0.01 & 0.004 & 0.011 & 0.044 \\
0.01 & 0.011 & 0.039 & 0.025 & 0.005 & 0.029 & 0.024 & 0.023 & 0.021 & 0.042 \\
0.048 & 0.03 & 0.019 & 0.045 & 0.044 & 0.033 & 0.014 & 0.03 & 0.042 & 0.05
\end{bmatrix}
\qquad
D = \begin{bmatrix} 45 \\ 10 \\ 11 \\ 17 \\ 48 \\ 32 \\ 42 \\ 21 \\ 34 \\ 40 \end{bmatrix}
\]
13. The table contains estimates for national health a. Make a scatter plot of the data. 
care in billions of dollars. b. Use the 1970, 1980, and 1990 data to write a 


system of equations that can be used to find a 


Year | Dollars (billions) parabola that approximates the data. 
E m c. Solve the system found in part (b). 
d. Plot the parabola along with the data points. 

BIS 3o e. Use the model found in part (c) to predict an 

1975 120 estimate for national health care spending in 

1980 250 2010. 

1985 400 : 
-—+————— 14. The number of cellular phone subscribers 

1220 SA worldwide from 1985 to 2002 is given in the 


table. Use the data from 1985, 1990, and 2000 to 
fit a parabola to the data points. Use the quadratic 
function to predict the number of cellular phone 
subscribers expected in 2010. 


Year Cellular Phone 
Subscribers(millions) 

1985 1 

1990 11 

2000 741 

2001 955 

2002 1155 


In Exercises 15-18, use the power of a matrix to solve
the problems. That is, for a matrix $A$, the $n$th power is

\[
A^n = \underbrace{A \cdot A \cdot A \cdots A}_{n \text{ times}}
\]

15. Demographers are interested in the movement of 


populations or groups of populations from one 
region to another. Suppose each year it is 
estimated that 90 percent of the people of a city 
remain in the city, 10 percent move to the suburbs, 
92 percent of the suburban population remain in 
the suburbs, and 8 percent move to the city. 


a. Write a 2 x 2 transition matrix that describes 
the percentage of the populations that move 
from city to city (remain in the city), city to 
suburbs, suburbs to suburbs (remain in the 
suburbs), and suburbs to city. 


b. If in the year 2002 the population of a city was 
1,500,000 and of the suburbs was 600,000, 
write a matrix product that gives a 2 x 1 vector 
containing the populations in the city and in 
the suburbs in the year 2003. Multiply the 
matrices to find the populations. 


c. If in the year 2002 the population of a city was
1,500,000 and of the suburbs was 600,000,
write a matrix product that gives a $2 \times 1$ vector
containing the populations in the city and in
the suburbs in the year 2004. Multiply the
matrices to find the populations.

d. Give a matrix product in terms of powers of
the matrix found in part (a) for the size of the
city and suburban populations in any year after
2002.


16. To study the spread of a disease, a medical
researcher infects 200 laboratory mice of a
population of 1000. The researcher estimates that
it is likely that 80 percent of the infected mice
will recover in a week and 20 percent of healthy
mice will contract the disease in the same week.

a. Write a $2 \times 2$ matrix that describes the
percentage of the population that transition
from healthy to healthy, healthy to infected,
infected to infected, and infected to healthy.

b. Determine the number of healthy and infected
mice after the first week.

c. Determine the number of healthy and infected
mice after the second week.

d. Determine the number of healthy and infected
mice after six weeks.


17. In a population of 50,000 there are 20,000
nonsmokers, 20,000 smokers of one pack or less a
day, and 10,000 smokers of more than one pack a
day. During any month it is likely that only 10
percent of the nonsmokers will become smokers
of one pack or less a day and the rest will remain
nonsmokers, 20 percent of the smokers of a pack
or less will quit smoking, 30 percent will increase
their smoking to more than one pack a day, 30
percent of the heavy smokers will remain smokers
but decrease their smoking to one pack or less,
and 10 percent will go cold turkey and quit. After
one month what part of the population is in each
category? After two months how many are in
each category? After one year how many are in
each category?


18. An entrepreneur has just formed a new company
to compete with the established giant in the
market. She hired an advertising firm to develop a

88 Chapter 1 


campaign to introduce her product to the market. 
The advertising blitz seems to be working, and in 
any given month 2 percent of the consumers 
switch from the time-honored product to the new 
improved version, but at the same time 5 percent 
of those using the new product decide to switch 
back to the old established brand. How long will 
it take for the new company to acquire 20 percent 
of the consumers? 


In Exercises 19 and 20, the figure shows an electrical 
network. In an electrical network, current is measured 
in amperes, resistance in ohms, and the product of 
current and resistance in volts. Batteries are 
represented using two parallel line segments of 
unequal length, and it is understood the current flows 
out of the terminal denoted by the longer line segment. 
Resistance is denoted using a sawtooth. To analyze an 
electrical network requires Kirchhoff’s laws, which 
state all current flowing into a junction, denoted using 
a black dot, must flow out and the sum of the products 
of current I and resistance R around a closed path (a
loop) is equal to the total voltage in the path. 


19. a. Apply Kirchhoff’s first law to either junction to
write an equation involving $I_1$, $I_2$, and $I_3$.
b. Apply Kirchhoff’s second law to the two loops
to write two linear equations.

c. Solve the system of equations from parts
(a) and (b) to find the currents $I_1$, $I_2$, and $I_3$.


20. a. Apply Kirchhoff’s first law to the four 
junctions to write four equations involving 
currents. 

b. Apply Kirchhoff’s second law to the three 
loops to write three linear equations. 

c. Solve the system of equations from parts 
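(a) and (b) to find the currents $I_1$, $I_2$, $I_3$, $I_4$, $I_5$,
and $I_6$.

[Figure for Exercise 20: electrical network; the legible labels include batteries of 14 V, 18 V, and 16 V and resistances of 3 and 6 ohms.]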


In Exercises 21 and 22, use the fact that if a plate has 
reached a thermal equilibrium, then the temperature at 
a grid point, not on the boundary of the plate, is the 
average of the temperatures of the four closest grid 
points. The temperatures are equal at each point on a 
boundary, as shown in the figure. Estimate the 
temperature at each interior grid point. 


21. [Figure: plate with grid points; a boundary temperature of 30 is legible.]


22. [Figure: plate with grid points; a boundary temperature of 30 is legible.]


Review Exercises for Chapter 1 


1. Consider the linear system

\[
\begin{aligned}
x + y + 2z + w &= 3 \\
-x + z + 2w &= 1 \\
2x + 2y + w &= -2 \\
x + y + 2z + 3w &= 5
\end{aligned}
\]

a. Define the coefficient matrix A for the linear
system.

b. Find det(A).

c. Is the linear system consistent? Explain.

d. Find all solutions to Ax = 0.

e. Is the matrix A invertible? If yes, then find the
inverse.

f. Solve the linear system.

2. The augmented matrix of a linear system has the
form

\[
\left[\begin{array}{rrrr|r}
1 & -1 & 2 & 1 & a \\
-1 & 3 & 1 & 1 & b \\
3 & -5 & 5 & 1 & c \\
2 & -2 & 4 & 2 & d
\end{array}\right]
\]

a. Can you decide by inspection whether the
determinant of the coefficient matrix is 0?
Explain.

b. Can you decide by inspection whether the
linear system has a unique solution for every
choice of a, b, c, and d? Explain.

c. Determine the values of a, b, c, and d for
which the linear system is consistent.

d. Determine the values of a, b, c, and d for
which the linear system is inconsistent.

e. Does the linear system have a unique solution
or infinitely many solutions?

f. If a = 2, b = 1, c = -1, and d = 4, describe
the solution set for the linear system.

3. Find all idempotent matrices of the form

\[
\begin{bmatrix} a & b \\ 0 & c \end{bmatrix}
\]

4. Let S denote the set of all 2 x 2 matrices. Find all
matrices $\begin{bmatrix} a & b \\ c & d \end{bmatrix}$ that will commute with every
matrix in S.
5. Let A and B be 2 x 2 matrices. 


a. Show that the sum of the terms on the main 
diagonal of AB — BA is 0. 


b. If M is a 2 x 2 matrix and the sum of the main
diagonal entries is 0, show there is a constant c
such that

\[
M^2 = cI
\]

c. If A, B, and C are 2 x 2 matrices, then use
parts (a) and (b) to show that

\[
(AB - BA)^2 C = C(AB - BA)^2
\]


6. Find the traffic flow pattern for the network in the 
figure. Flow rates are in cars per hour. Give one 
specific solution. 


[Figure: traffic network; the labeled flow rates are 300, 500, 200, 400, 400, 600, and 500 cars per hour.]


7. a. Explain why the matrix

\[
A = \begin{bmatrix}
1 & 1 & 1 & 1 & 1 \\
0 & 1 & 1 & 1 & 1 \\
0 & 0 & 1 & 1 & 1 \\
0 & 0 & 0 & 1 & 1 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\]

is invertible.

b. Determine the maximum number of 1’s that
can be added to A such that the resulting
matrix is invertible.

8. Show that if A is invertible, then $A^t$ is invertible
and $(A^t)^{-1} = (A^{-1})^t$.

9. A matrix A is skew-symmetric provided $A^t = -A$.

a. Let A be an n x n matrix and define

\[
B = A + A^t \quad \text{and} \quad C = A - A^t
\]

Show that B is symmetric and C is
skew-symmetric.

b. Show that every n x n matrix can be written as
the sum of a symmetric and a skew-symmetric
matrix.

10. Suppose u and v are solutions to the linear system
Ax = b. Show that if scalars $\alpha$ and $\beta$ satisfy
$\alpha + \beta = 1$, then $\alpha\mathbf{u} + \beta\mathbf{v}$ is also a solution to the
linear system Ax = b.


Chapter 1: Chapter Test 


In Exercises 1—45, determine whether the statement is 
true or false. 


1. A 2 x 2 linear system has one solution, no 
solutions, or infinitely many solutions. 


2. A 3 x 3 linear system has no solutions, one 
solution, two solutions, three solutions, or 
infinitely many solutions. 


3. If A and B are n x n matrices with no zero
entries, then $AB \neq 0$.


4. Homogeneous linear systems always have at least 
one solution. 


5. If A is an n x n matrix, then Ax = 0 has a 
nontrivial solution if and only if the matrix A has 
an inverse. 


6. If A and B are n x n matrices and Ax = Bx for 
every n x 1 matrix x, then A = B. 


7. If A, B, and C are invertible n x n matrices, then
$(ABC)^{-1} = A^{-1}B^{-1}C^{-1}$.


8. If A is an invertible n x n matrix, then the linear 
system Ax = b has a unique solution. 


9. If A and B are n x n invertible matrices and
AB = BA, then A commutes with $B^{-1}$.


10. If A and B commute, then $A^2B = BA^2$.


11. The matrix

\[
\begin{bmatrix}
1 & -2 & 3 & 1 & 0 \\
0 & -1 & 4 & 3 & 2 \\
0 & 0 & 3 & 5 & -2 \\
0 & 0 & 0 & 0 & 4 \\
0 & 0 & 0 & 0 & 6
\end{bmatrix}
\]

does not have an inverse.


12. Interchanging two rows of a matrix changes the
sign of its determinant.


13. Multiplying a row of a matrix by a nonzero
constant results in the determinant being
multiplied by the same nonzero constant.


14. If two rows of a matrix are equal, then the
determinant of the matrix is 0.


15. Performing the operation $aR_i + R_j \to R_j$ on a
matrix multiplies the determinant by the
constant a.


ak 2 2 = 
ra=] o | then a? = 74 = 20 
17. If A and B are invertible matrices, then A + B is
an invertible matrix.


18. If A and B are invertible matrices, then AB is an
invertible matrix.


19. 


20. 


21. 


22. 


23. 


24. 
25. 


If A is an n x n matrix and A does not have an 
inverse, then the linear system Ax = b is 
inconsistent. 


The linear system 
B23 x 1 
6 5 4 y | =] 2 
0 0 0 Z 3 


is inconsistent. 


The inverse of the matrix 
2 —l1 : 1 1 
3 1 1S -3 2 
The matrix 
2 —l1 
TE 


does not have an inverse. 

If the n x n matrix A is idempotent and
invertible, then A = I.

If A and B commute, then $A^t$ and $B^t$ commute.


If A is an n x n matrix and det(A) = 3, then 
det(A A) = 9. 


In Exercises 26—32, use the linear system 


26. 


27. 


28. 


29. 


30. 


31. 


2x+2y=3 
x- y=1 
The coefficient matrix is 


eet 


The coefficient matrix A has determinant 


det(A) = 0 
The linear system has a unique solution. 


The only solution to the linear system is 
x = —7/4 and y = —5/4. 
The inverse of the coefficient matrix A is 


1 1l 
-1_| 4 2 
eed (tee 

4 2 


The linear system is equivalent to the matrix 


UGB 




32. The solution to the system is given by the matrix 


equation 


Al BIR 


In Exercises 33—36, use the linear system 


36. 


\[
\begin{aligned}
x_1 + 2x_2 - 3x_3 &= 1 \\
2x_1 + 5x_2 - 8x_3 &= 4 \\
-2x_1 - 4x_2 + 6x_3 &= -2
\end{aligned}
\]


. The determinant of the coefficient matrix is 


5 —8 
—4 6 


2 —8 
—2 6 


2 5 
=u =4 


d 


7 


. The determinant of the coefficient matrix is 0. 


35. 


A solution to the linear system is
$x_1 = -4$, $x_2 = 0$, and $x_3 = -1$.


The linear system has infinitely many solutions,
and the general solution is given by $x_3$ is free,
$x_2 = 2 + 2x_3$, and $x_1 = -3 - x_3$.


In Exercises 37—41, use the matrix 


37. 


38. 


39. 


-1 —2 1 
A= 1 0 1 -i 
2 1 2 
After the operation $R_1 \leftrightarrow R_2$ is performed, the
matrix becomes


After the operation $-2R_1 + R_3 \to R_3$ is
performed on the matrix found in Exercise 37, the
matrix becomes


1 0 1 -1 
al. A 3 
0 -2 0 -3 


The matrix A is row equivalent to 


1 0 1 -1 
0 -2 2 2 
0 0 1 4 




40. The reduced row echelon form of A is 


1 0 0 -5 
0 1 0 3 
0 0 1 4 


41. If A is viewed as the augmented matrix of a 
linear system, then the solution to the linear 
system is x = —5, y= 3, and z = 4. 


In Exercises 42—45, use the matrices 


by, ED 
A=| -2 3 1 
4 0 -3 
f121 
Tir 32 


42. 


43. 


44. 


45. 


The matrix products AB and BA are both defined. 


The matrix expression —2BA + 3B simplifies to a 
2 x 3 matrix. 


The matrix expression —2BA + 3B equals 
-3 -5 3 
—5 7 16 


The matrix A? is 


74 -3 
-4 7 —4 
-8 4 17 


CHAPTER OUTLINE 


Linear Combinations 
and Linear Independence 


2.1 Vectors in $\mathbb{R}^n$
2.2 Linear Combinations
2.3 Linear Independence


In the broadest sense a signal is any time-varying quantity. The motion of a particle
through space, for example, can be thought of as a signal. A seismic disturbance is
detected as signals from within the earth. Sound caused by the vibration of a string is
a signal, radio waves are signals, and a digital picture with colors represented
numerically also can be considered a signal. A video signal is a sequence of images.
Signals represented using real numbers are called continuous while others that use
integers are called discrete. A compact disc contains discrete signals representing
sound. Some signals are periodic; that is, the waveform or shape of the signal repeats
at regular intervals. The period of a wave is the time it takes for one cycle of the
wave, and the frequency is the number of cycles that occur per unit of time. If the
period of a wave is 2T, then the frequency is $F = \frac{1}{2T}$. Every periodic motion is the
mixture of sine and cosine waves with frequencies proportional to a common frequency,
called the fundamental frequency. A signal with period 2T is a mixture of the functions

\[
1,\ \cos\frac{\pi x}{T},\ \sin\frac{\pi x}{T},\ \cos\frac{2\pi x}{T},\ \sin\frac{2\pi x}{T},\ \cos\frac{3\pi x}{T},\ \sin\frac{3\pi x}{T},\ \ldots
\]

and for any n, the signal can be approximated by the fundamental set

\[
1,\ \cos\frac{\pi x}{T},\ \sin\frac{\pi x}{T},\ \cos\frac{2\pi x}{T},\ \sin\frac{2\pi x}{T},\ \ldots,\ \cos\frac{n\pi x}{T},\ \sin\frac{n\pi x}{T}
\]



The approximation obtained from the sum of the elements of the fundamental set with 
appropriate coefficients, or weights, has the form 


\[
a_0 + a_1\cos\frac{\pi x}{T} + b_1\sin\frac{\pi x}{T} + a_2\cos\frac{2\pi x}{T} + b_2\sin\frac{2\pi x}{T} + \cdots + a_n\cos\frac{n\pi x}{T} + b_n\sin\frac{n\pi x}{T}
\]


This sum is called a linear combination of the elements of the fundamental set. A
square wave on the interval $[-\pi, \pi]$ along with the approximations

\[
\frac{4}{\pi}\sin x, \qquad \frac{4}{\pi}\sin x + \frac{4}{3\pi}\sin 3x, \qquad \frac{4}{\pi}\sin x + \frac{4}{3\pi}\sin 3x + \frac{4}{5\pi}\sin 5x
\]

\[
\frac{4}{\pi}\sin x + \frac{4}{3\pi}\sin 3x + \frac{4}{5\pi}\sin 5x + \frac{4}{7\pi}\sin 7x
\]

are shown in Fig. 1. As more terms are added, the approximations become better. 
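
The improvement can be checked numerically. The following is a minimal Python sketch, assuming the square wave takes the value 1 on (0, π) and −1 on (−π, 0); this normalization is an assumption chosen to match the coefficients 4/(kπ), and the function names are illustrative only. The error is measured away from the jumps at 0 and ±π, where the partial sums cannot converge uniformly.

```python
import math

def square_wave(x):
    """Assumed square wave on [-pi, pi]: +1 for x > 0, -1 for x < 0, 0 at x = 0."""
    if x > 0:
        return 1.0
    if x < 0:
        return -1.0
    return 0.0

def partial_sum(x, terms):
    """Sum of the first `terms` odd harmonics: (4/pi) * sin(kx)/k for k = 1, 3, 5, ..."""
    return sum(4.0 / (math.pi * k) * math.sin(k * x) for k in range(1, 2 * terms, 2))

def max_error(terms, lo=0.5, hi=math.pi - 0.5, samples=200):
    """Largest gap between the wave and its approximation on [lo, hi], away from the jumps."""
    xs = [lo + (hi - lo) * i / (samples - 1) for i in range(samples)]
    return max(abs(square_wave(x) - partial_sum(x, terms)) for x in xs)

if __name__ == "__main__":
    for terms in (1, 2, 3, 4, 50):
        print(terms, round(max_error(terms), 4))
```

Each printed row pairs the number of sine terms with the largest sampled error on the chosen interval.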


Figure 1 (square wave and its approximations)


In Chap. 1 we defined a vector, with n entries, as an n x 1 matrix. Vectors are
used not only in mathematics, but in virtually every branch of science. In this chapter 
we study sets of vectors and analyze their additive properties. The concepts presented 
here, in the context of vectors, are fundamental to the study of linear algebra. In 
Chap. 3, we extend these concepts to abstract vector spaces, including spaces of 
functions as described in the opening example. 


2.1 Vectors in $\mathbb{R}^n$


Euclidean 2-space, denoted by $\mathbb{R}^2$, is the set of all vectors with two entries, that is,

\[
\mathbb{R}^2 = \left\{ \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \;\middle|\; x_1, x_2 \text{ are real numbers} \right\}
\]

Similarly Euclidean 3-space, denoted by $\mathbb{R}^3$, is the set of all vectors with three entries,
that is,

\[
\mathbb{R}^3 = \left\{ \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \;\middle|\; x_1, x_2, x_3 \text{ are real numbers} \right\}
\]

In general, Euclidean n-space consists of vectors with n entries.


DEFINITION 1 Vectors in $\mathbb{R}^n$ Euclidean n-space, denoted by $\mathbb{R}^n$, or simply n-space, is defined by

\[
\mathbb{R}^n = \left\{ \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} \;\middle|\; x_i \in \mathbb{R}, \text{ for } i = 1, 2, \ldots, n \right\}
\]

The entries of a vector are called the components of the vector.


Geometrically, in $\mathbb{R}^2$ and $\mathbb{R}^3$ a vector is a directed line segment from the origin to
the point whose coordinates are equal to the components of the vector. For example,
the vector in $\mathbb{R}^2$ given by

\[
\mathbf{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}
\]

is the directed line segment from the origin (0,0) to the point (1,2), as shown in
Fig. 2. The point (0,0) is the initial point, and the point (1,2) is the terminal
point. The length of a vector is the length of the line segment from the initial point
to the terminal point. For example, the length of $\mathbf{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is $\sqrt{1^2 + 2^2} = \sqrt{5}$. A
vector is unchanged if it is relocated elsewhere in the plane, provided that the length
and direction remain unchanged. For example, the directed line segments between
(0, 0) and (1,2) and between (2,2) and (3,4) are both representations of the same
vector $\mathbf{v} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$. See Fig. 3. When the initial point of a vector is the origin, we say
that the vector is in standard position.

Figure 2

Figure 3

Since vectors are matrices, two vectors are equal provided that their corresponding
components are equal. The operations of addition and scalar multiplication are defined
componentwise as they are for matrices.
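
As a small illustration of length and of relocating a vector, the following Python sketch recomputes the numbers in the discussion above. The helper names `vector_between` and `length` are hypothetical and introduced only for this example.

```python
import math

def vector_between(initial, terminal):
    """Components of the directed segment from `initial` to `terminal`."""
    return tuple(t - i for i, t in zip(initial, terminal))

def length(v):
    """Euclidean length: square root of the sum of the squared components."""
    return math.sqrt(sum(c * c for c in v))

v_standard = vector_between((0, 0), (1, 2))  # standard position
v_moved = vector_between((2, 2), (3, 4))     # same length and direction, relocated

print(v_standard, v_moved, v_standard == v_moved)
print(length(v_standard), math.sqrt(5))
```

Both segments produce the same components, which is exactly the sense in which they represent the same vector.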


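DEFINITION 2 Addition and Scalar Multiplication of Vectors Let u and v be vectors in
$\mathbb{R}^n$ and c a scalar.


1. The sum of u and v is

\[
\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix}
\]

2. The scalar product of c and u is

\[
c\mathbf{u} = c\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} cu_1 \\ cu_2 \\ \vdots \\ cu_n \end{bmatrix}
\]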

These algebraic definitions of vector addition and scalar multiplication agree with 
the standard geometric definitions. Two vectors u and v are added according to the 
parallelogram rule, as shown in Fig. 4(a). The vector cu is a scaling of the vector u. 
In Fig. 4(b) are examples of scaling a vector with 0 < c < 1 and c > 1. In addition,
if c < 0, then the vector cu is reflected through the origin, as shown in Fig. 4(b). The 
difference u — v = u + (—v) is the vector shown in Fig. 4(c). As shown in Fig. 4(c), 
it is common to draw the difference vector u — v from the terminal point of v to the 
terminal point of u. 


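EXAMPLE 1 Let

\[
\mathbf{u} = \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} \qquad \mathbf{v} = \begin{bmatrix} -1 \\ 4 \\ 3 \end{bmatrix} \qquad \text{and} \qquad \mathbf{w} = \begin{bmatrix} 4 \\ 2 \\ 6 \end{bmatrix}
\]

Find $(2\mathbf{u} + \mathbf{v}) - 3\mathbf{w}$.


Solution Using the componentwise definitions of addition and scalar multiplication, we have

\[
\begin{aligned}
(2\mathbf{u} + \mathbf{v}) - 3\mathbf{w}
&= \left( 2\begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} + \begin{bmatrix} -1 \\ 4 \\ 3 \end{bmatrix} \right) - 3\begin{bmatrix} 4 \\ 2 \\ 6 \end{bmatrix} \\
&= \left( \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix} + \begin{bmatrix} -1 \\ 4 \\ 3 \end{bmatrix} \right) + \begin{bmatrix} -12 \\ -6 \\ -18 \end{bmatrix} \\
&= \begin{bmatrix} 1 \\ 0 \\ 9 \end{bmatrix} + \begin{bmatrix} -12 \\ -6 \\ -18 \end{bmatrix} = \begin{bmatrix} -11 \\ -6 \\ -9 \end{bmatrix}
\end{aligned}
\]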
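
The same arithmetic can be reproduced with a few lines of Python. This is only a sketch: the helpers `add` and `scale` are hypothetical names that mirror the componentwise definitions, and the vectors are taken to be u = (1, −2, 3), v = (−1, 4, 3), and w = (4, 2, 6) as in the example.

```python
def add(u, v):
    """Componentwise sum of two vectors of the same size."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return [a + b for a, b in zip(u, v)]

def scale(c, u):
    """Scalar product c*u, computed component by component."""
    return [c * a for a in u]

u = [1, -2, 3]
v = [-1, 4, 3]
w = [4, 2, 6]

# (2u + v) - 3w, written as (2u + v) + (-3)w
result = add(add(scale(2, u), v), scale(-3, w))
print(result)
```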
Figure 4 (panels (a), (b), and (c))


Vectors in R”, being matrices with n rows and 1 column, enjoy all the algebraic 
properties of matrices that we saw in Chap. 1. 


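EXAMPLE 2 Show that vector addition is commutative.


Solution If u and v are vectors in $\mathbb{R}^n$, then

\[
\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix}
\]

Since addition for real numbers is commutative,

\[
\mathbf{u} + \mathbf{v} = \begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \\ \vdots \\ u_n + v_n \end{bmatrix} = \begin{bmatrix} v_1 + u_1 \\ v_2 + u_2 \\ \vdots \\ v_n + u_n \end{bmatrix} = \mathbf{v} + \mathbf{u}
\]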


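The zero vector in $\mathbb{R}^n$ is the vector with each component equal to 0, that is,

\[
\mathbf{0} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\]

Hence, for any vector v in $\mathbb{R}^n$, we have v + 0 = v. Recall that for any real number a
there is a unique number −a such that a + (−a) = 0. This enables us to define the
additive inverse of any vector v as the vector

\[
-\mathbf{v} = \begin{bmatrix} -v_1 \\ -v_2 \\ \vdots \\ -v_n \end{bmatrix}
\]

so that v + (−v) = 0.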

Theorem 1 summarizes the essential algebraic properties of vectors in R”. These 
properties serve as a model for the structure we will require for the abstract vec- 
tor spaces of Chap. 3. The first of these properties was proved in Example 2. The 
remaining justifications are left as exercises. 


THEOREM 1 Let u, v, and w be vectors in $\mathbb{R}^n$, and let c and d be scalars. The following algebraic
properties hold.


1. Commutative property: u + v = v + u
2. Associative property: (u + v) + w = u + (v + w)
3. Additive identity: The vector 0 satisfies 0 + u = u + 0 = u.
4. Additive inverse: For every vector u, the vector −u satisfies
u + (−u) = −u + u = 0.
5. c(u + v) = cu + cv
6. (c + d)u = cu + du
7. c(du) = (cd)u
8. (1)u = u


By the associative property, the vector sum $\mathbf{u}_1 + \mathbf{u}_2 + \cdots + \mathbf{u}_n$ can be computed
unambiguously, without the need for parentheses. This will be important in Sec. 2.2. 
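
The eight properties of Theorem 1 can be spot-checked on concrete vectors. The sketch below is a numerical illustration, not a proof: it draws random integer vectors (so all arithmetic is exact) and tests each identity once. The function names and the dictionary keys are hypothetical labels chosen for this example.

```python
import random

def add(u, v):
    """Componentwise sum."""
    return [a + b for a, b in zip(u, v)]

def scale(c, u):
    """Scalar product c*u."""
    return [c * a for a in u]

def check_properties(u, v, w, c, d):
    """Evaluate each identity of Theorem 1 on the given vectors and scalars."""
    zero = [0] * len(u)
    neg_u = scale(-1, u)
    return {
        "1 commutative": add(u, v) == add(v, u),
        "2 associative": add(add(u, v), w) == add(u, add(v, w)),
        "3 additive identity": add(zero, u) == u == add(u, zero),
        "4 additive inverse": add(u, neg_u) == zero == add(neg_u, u),
        "5 c(u + v) = cu + cv": scale(c, add(u, v)) == add(scale(c, u), scale(c, v)),
        "6 (c + d)u = cu + du": scale(c + d, u) == add(scale(c, u), scale(d, u)),
        "7 c(du) = (cd)u": scale(c, scale(d, u)) == scale(c * d, u),
        "8 (1)u = u": scale(1, u) == u,
    }

random.seed(0)  # fixed seed so the run is repeatable
n = 4
u, v, w = ([random.randint(-9, 9) for _ in range(n)] for _ in range(3))
c, d = random.randint(-9, 9), random.randint(-9, 9)

results = check_properties(u, v, w, c, d)
for name, holds in results.items():
    print(name, holds)
```

Integers are used deliberately: with floating-point components, properties such as associativity can fail by rounding error even though they hold for real numbers.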


1 2 4 


melee | 


Verify that the associative property holds for these three vectors. Also verify that 
for any scalars c and d, c(du) = (cd)u. 


Solution To verify the associative property, we have 


(ut+v)+w= (| E | + 


and 


Hence, (u + v) + w = u + (v + w).
For the second verification, we have 


wan =e(a] i eel a e fea elites 


The properties given in Theorem 1 can be used to establish other useful properties 
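of vectors in $\mathbb{R}^n$. For example, if $\mathbf{u} \in \mathbb{R}^n$ and c is a scalar, then

\[
0\mathbf{u} = 0\begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix} = \mathbf{0} \qquad \text{and} \qquad c\mathbf{0} = \mathbf{0}
\]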


We also have the property that (—1)u = —u. That is, the scalar product of —1 with 
u is the additive inverse of u. In the case of real numbers, the statement xy = 0 is 
equivalent to x = 0 or y = 0. A similar property holds for scalar multiplication. That 
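is, if cu = 0, then either c = 0 or u = 0. To see this, let

\[
c\mathbf{u} = \begin{bmatrix} cu_1 \\ cu_2 \\ \vdots \\ cu_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}
\]

so that $cu_1 = 0$, $cu_2 = 0$, ..., $cu_n = 0$. If c = 0, then the conclusion holds. Otherwise,
$u_1 = u_2 = \cdots = u_n = 0$, that is, u = 0.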


Fact Summary 


1. The definitions of vector addition and scalar multiplication in $\mathbb{R}^n$ agree with
the definitions for matrices in general and satisfy all the algebraic properties
of matrices.

2. The zero vector, whose components are all 0, is the additive identity for
vectors in $\mathbb{R}^n$. The additive inverse of a vector v, denoted by −v, is
obtained by negating each component of v.

3. For vectors in $\mathbb{R}^2$ and $\mathbb{R}^3$ vector addition agrees with the standard
parallelogram law. Multiplying such a vector by a positive scalar changes
the length of the vector but not the direction. If the scalar is negative, the
vector is reflected through the origin.


Exercise Set 2.1 


In Exercises 1—6, use the vectors 5. Find —3(u+ v) — w. 


1 —2 6. Find 2u — 3(v — 2w). 
u=| -2 v= 4 
3 0 In Exercises 7—10, use the vectors 


1. Findu+vandv+u. 
7. Find —2(u + 3v) + 3u. 
2. Find (u + v) + w and u + (v + w). 
8. Find 3u — 2v. 
3. Find u — 2v + 3w. 
9. If x; and x are real scalars, verify that 


4. Find —u + iv — 2w. (x1 + x2)U = x1 U + XU. 




10. If x; is a real scalar, verify that 2 =j 0 
xı (u + v) = xu + xX Vv. 18. cı 5 tcez ret E 
In Exercises 11—14, let 
1 0 aleea e 
e = 0 €&€ = 1 
9 2 20 —1 2- | = 
0 - Cl 3 + c2 lF 1 
€3 = 0 
1 —4 0 —5 —3 
Write the given vector in terms of the vectors e1, e2, aa : +e tos ; 2 a 
and e3. = E 
2 1 1 -1 
11. v= 4 22. c1 —1 + c2 1 + 63 1 = 0 
1 0 — -1 
—1 —1 —1 1 —1 
12. v= 3 23. cı 0 +e 1 +6 -l1 |= 0 
2 1 1 -1 2 
0 -1 0 6 
13. v= 3 24. cı 2 |+e2] 2 ]+c3] 1]=] 7 
—2 4 4 3 
-1 In Exercises 25—28, find all vectors | " | so that the 
14. v= 0 : 
1 vector equation can be solved. 
2 
1 2 a 
In Exercises 15 and 16, find w such that 25. caf a Ce a |= b 
—u+3v—2w=0. 
1 -1 
1 =) 26.0] | | al els. 
15.u= | 4 v= 2 
2 0 
rala AAR 
—2 2 
16. u = 0 v=] -3 
3 6 
1 4 afi }+als]=|¢ | 


In Exercises 17—24, write the vector equation as an 


equivalent linear system and then solve the system. A and all veetors 


Explain what the solution to the linear system implies a 
about the vector equation. v=| b 
c 
1 3 —2 ‘ 
17. cı 2 +o sal S a so that the vector equation civ] + c2V2 + c3V3 = V can 
be solved. 


29. 


30. 


31. 


v= 


V3 


v= 


V3 = 


vV = 


V3 = 


| Cee 0) cen 0 es: (20) Ce | cs |e es | 


= o m. 


or 


V = 


V = 




-1 1 
32. vi = 0 v= | -2 
1 2 8 


1 
V3 = —1 
3 


In Exercises 33-39, verify the indicated vector
property of Theorem 1 for vectors in $\mathbb{R}^n$.


33. Property 2.
34. Property 3.
35. Property 4.
36. Property 5.
37. Property 6.
38. Property 7.
39. Property 8.


40. Prove that the zero vector in $\mathbb{R}^n$ is unique.


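2.2 Linear Combinations


In three-dimensional Euclidean space $\mathbb{R}^3$ the vectors that define the three coordinate
axes are the vectors

\[
\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \qquad \text{and} \qquad \mathbf{e}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]

Every vector in $\mathbb{R}^3$ can then be obtained from these three vectors. For example, the
vector

\[
\mathbf{v} = \begin{bmatrix} 2 \\ 3 \\ 3 \end{bmatrix} = 2\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + 3\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + 3\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}
\]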

Geometrically, the vector v is obtained by adding scalar multiples of the above 
vectors, as shown in Fig. 1. The vectors e1, e2, and e3 are not unique in this respect. 
For example, the vector v can also be written as a combination of the vectors 


Figure 1


that is,

\[
3\mathbf{v}_1 - \mathbf{v}_2 + \mathbf{v}_3 = \begin{bmatrix} 2 \\ 3 \\ 3 \end{bmatrix}
\]

A vector written as a combination of other vectors using addition and scalar
multiplication is called a linear combination. Combining vectors in this manner plays a
central role in describing Euclidean spaces and, as we will see in Chap. 3, in describing
abstract vector spaces.


DEFINITION 1 Linear Combination Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ be a set of vectors in $\mathbb{R}^n$, and
let $c_1, c_2, \ldots, c_k$ be scalars. An expression of the form

\[
c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_k\mathbf{v}_k = \sum_{i=1}^{k} c_i\mathbf{v}_i
\]
is called a linear combination of the vectors of S. Any vector v that can be written 
in this form is also called a linear combination of the vectors of S. 


In Example 1 we show how linear systems are used to decide if a vector is a 
linear combination of a set of vectors. 


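EXAMPLE 1 Determine whether the vector

\[
\mathbf{v} = \begin{bmatrix} -1 \\ 1 \\ 10 \end{bmatrix}
\]

is a linear combination of the vectors

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \qquad \mathbf{v}_2 = \begin{bmatrix} -2 \\ 3 \\ -2 \end{bmatrix} \qquad \text{and} \qquad \mathbf{v}_3 = \begin{bmatrix} -6 \\ 7 \\ 5 \end{bmatrix}
\]

Solution The vector v is a linear combination of the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ if there are scalars
$c_1$, $c_2$, and $c_3$, such that

\[
\begin{aligned}
\mathbf{v} = \begin{bmatrix} -1 \\ 1 \\ 10 \end{bmatrix} &= c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 \\
&= c_1\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} -2 \\ 3 \\ -2 \end{bmatrix} + c_3\begin{bmatrix} -6 \\ 7 \\ 5 \end{bmatrix} \\
&= \begin{bmatrix} c_1 - 2c_2 - 6c_3 \\ 3c_2 + 7c_3 \\ c_1 - 2c_2 + 5c_3 \end{bmatrix}
\end{aligned}
\]

Equating components gives the linear system

\[
\begin{aligned}
c_1 - 2c_2 - 6c_3 &= -1 \\
3c_2 + 7c_3 &= 1 \\
c_1 - 2c_2 + 5c_3 &= 10
\end{aligned}
\]

To solve this linear system, we reduce the augmented matrix

\[
\left[\begin{array}{rrr|r} 1 & -2 & -6 & -1 \\ 0 & 3 & 7 & 1 \\ 1 & -2 & 5 & 10 \end{array}\right]
\quad \text{to} \quad
\left[\begin{array}{rrr|r} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -2 \\ 0 & 0 & 1 & 1 \end{array}\right]
\]

From the last matrix, we see that the linear system is consistent with the unique
solution

\[
c_1 = 1 \qquad c_2 = -2 \qquad \text{and} \qquad c_3 = 1
\]

Using these scalars, we can write v as the linear combination

\[
\mathbf{v} = \begin{bmatrix} -1 \\ 1 \\ 10 \end{bmatrix} = 1\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + (-2)\begin{bmatrix} -2 \\ 3 \\ -2 \end{bmatrix} + 1\begin{bmatrix} -6 \\ 7 \\ 5 \end{bmatrix}
\]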

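The case for which a vector is not a linear combination of a set of vectors is
illustrated in Example 2.


EXAMPLE 2 Determine whether the vector

\[
\mathbf{v} = \begin{bmatrix} -5 \\ 11 \\ -7 \end{bmatrix}
\]

is a linear combination of the vectors

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ -2 \\ 2 \end{bmatrix} \qquad \mathbf{v}_2 = \begin{bmatrix} 0 \\ 5 \\ 5 \end{bmatrix} \qquad \text{and} \qquad \mathbf{v}_3 = \begin{bmatrix} 2 \\ 0 \\ 8 \end{bmatrix}
\]

Solution The vector v is a linear combination of the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$ if there are scalars
$c_1$, $c_2$, and $c_3$, such that

\[
\begin{bmatrix} -5 \\ 11 \\ -7 \end{bmatrix} = c_1\begin{bmatrix} 1 \\ -2 \\ 2 \end{bmatrix} + c_2\begin{bmatrix} 0 \\ 5 \\ 5 \end{bmatrix} + c_3\begin{bmatrix} 2 \\ 0 \\ 8 \end{bmatrix}
\]

The augmented matrix corresponding to this equation is given by

\[
\left[\begin{array}{rrr|r} 1 & 0 & 2 & -5 \\ -2 & 5 & 0 & 11 \\ 2 & 5 & 8 & -7 \end{array}\right]
\]

Reducing the augmented matrix

\[
\left[\begin{array}{rrr|r} 1 & 0 & 2 & -5 \\ -2 & 5 & 0 & 11 \\ 2 & 5 & 8 & -7 \end{array}\right]
\quad \text{to} \quad
\left[\begin{array}{rrr|r} 1 & 0 & 2 & -5 \\ 0 & 5 & 4 & 1 \\ 0 & 0 & 0 & 2 \end{array}\right]
\]

shows that the linear system is inconsistent. Therefore, the vector v cannot be
written as a linear combination of the three vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$.
To see this geometrically, first observe that the vector $\mathbf{v}_3$ is a linear combination
of $\mathbf{v}_1$ and $\mathbf{v}_2$. That is,

\[
\mathbf{v}_3 = 2\mathbf{v}_1 + \frac{4}{5}\mathbf{v}_2
\]

Therefore, any linear combination of the three vectors $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$, is just a linear
combination of $\mathbf{v}_1$ and $\mathbf{v}_2$. Specifically,

\[
\begin{aligned}
c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 &= c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\left(2\mathbf{v}_1 + \frac{4}{5}\mathbf{v}_2\right) \\
&= (c_1 + 2c_3)\mathbf{v}_1 + \left(c_2 + \frac{4}{5}c_3\right)\mathbf{v}_2
\end{aligned}
\]

The set of all vectors that are linear combinations of $\mathbf{v}_1$ and $\mathbf{v}_2$ is a plane in $\mathbb{R}^3$,
which does not contain the vector v, as shown in Fig. 2.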
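
The geometric claim can be checked with a short computation. The sketch below uses the cross product, which the text has not introduced; it is swapped in here only as a convenient test, on the assumption that a vector lies in the plane through the origin spanned by v1 and v2 exactly when it is orthogonal to v1 × v2. The vectors are read from the example as v1 = (1, −2, 2), v2 = (0, 5, 5), v3 = (2, 0, 8), and v = (−5, 11, −7).

```python
from fractions import Fraction

def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )

def dot(a, b):
    """Dot product of two vectors of the same size."""
    return sum(x * y for x, y in zip(a, b))

v1 = (1, -2, 2)
v2 = (0, 5, 5)
v3 = (2, 0, 8)
v = (-5, 11, -7)

normal = cross(v1, v2)  # normal to the plane spanned by v1 and v2

# v3 = 2*v1 + (4/5)*v2, computed exactly
combo = tuple(2 * a + Fraction(4, 5) * b for a, b in zip(v1, v2))

print("v3 in plane:", dot(normal, v3) == 0)
print("v in plane: ", dot(normal, v) == 0)
print("2*v1 + (4/5)*v2 == v3:", combo == v3)
```

A zero dot product with the normal places a vector in the plane; a nonzero value, as for v, shows it lies off the plane, in agreement with the inconsistent system.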


Figure 2 




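Analogous to $\mathbb{R}^3$, in $\mathbb{R}^n$ are the n special vectors given by

\[
\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} \qquad \mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} \qquad \cdots \qquad \mathbf{e}_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}
\]

These vectors can also be defined by the equations

\[
(\mathbf{e}_k)_i = \begin{cases} 1 & \text{if } i = k \\ 0 & \text{if } i \neq k \end{cases}
\]

where $1 \le k \le n$.
An important property of these special vectors is that every vector in $\mathbb{R}^n$ can be
written as a linear combination of these vectors. Indeed, for any vector v in $\mathbb{R}^n$, let
the scalars be the components of the vector, so that

\[
\begin{aligned}
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
&= v_1\begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix} + v_2\begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix} + \cdots + v_n\begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix} \\
&= v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + \cdots + v_n\mathbf{e}_n
\end{aligned}
\]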
Linear combinations of more abstract objects can also be formed, as illustrated 
in Example 3 using 2 x 2 matrices. This type of construction is used extensively in 
Chap. 3 when we consider abstract vector spaces. 
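
The decomposition of a vector into the special vectors e_1, ..., e_n is easy to see in code. This is a minimal sketch with hypothetical helper names; indices run from 0 in Python rather than from 1 as in the text.

```python
def standard_basis(n):
    """Return e_1, ..., e_n as lists; entry i of e_k is 1 when i == k and 0 otherwise."""
    return [[1 if i == k else 0 for i in range(n)] for k in range(n)]

def combine(scalars, vectors):
    """Linear combination: the sum of scalars[k] * vectors[k], computed componentwise."""
    n = len(vectors[0])
    return [sum(c * vec[i] for c, vec in zip(scalars, vectors)) for i in range(n)]

v = [2, 3, 3]
basis = standard_basis(len(v))

# Using the components of v as the scalars rebuilds v itself.
rebuilt = combine(v, basis)
print(basis)
print(rebuilt)
```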


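EXAMPLE 3 Show that the matrix

\[
A = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}
\]

is a linear combination of the matrices

\[
M_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \qquad M_2 = \begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} \qquad \text{and} \qquad M_3 = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
\]

Solution Similar to the situation with vectors, we must find scalars $c_1$, $c_2$, and $c_3$ such that

\[
c_1M_1 + c_2M_2 + c_3M_3 = A
\]

that is,

\[
c_1\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + c_2\begin{bmatrix} 0 & 1 \\ 1 & 1 \end{bmatrix} + c_3\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}
\]

After performing the scalar multiplication and addition, we obtain

\[
\begin{bmatrix} c_1 + c_3 & c_2 + c_3 \\ c_2 + c_3 & c_1 + c_2 + c_3 \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}
\]

Equating corresponding entries gives the linear system

\[
\begin{aligned}
c_1 + c_3 &= 1 \\
c_2 + c_3 &= 1 \\
c_1 + c_2 + c_3 &= 0
\end{aligned}
\]

This system is consistent with solution $c_1 = -1$, $c_2 = -1$, and $c_3 = 2$. Thus, the
matrix A is a linear combination of the matrices $M_1$, $M_2$, and $M_3$.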
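
The stated scalars can be verified by forming the combination directly. In the sketch below the matrices are an assumption: M1, M2, and M3 are taken to be the matrices whose entrywise combination produces the linear system c1 + c3 = 1, c2 + c3 = 1, c1 + c2 + c3 = 0 with right-hand side A. The helper `combine` is a hypothetical name.

```python
# Assumed matrices, chosen to be consistent with the linear system of the example.
M1 = [[1, 0], [0, 1]]
M2 = [[0, 1], [1, 1]]
M3 = [[1, 1], [1, 1]]
A = [[1, 1], [1, 0]]

def combine(scalars, matrices):
    """Entrywise linear combination of equally sized matrices."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(c * M[i][j] for c, M in zip(scalars, matrices)) for j in range(cols)]
        for i in range(rows)
    ]

result = combine([-1, -1, 2], [M1, M2, M3])
print(result, result == A)
```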


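EXAMPLE 4 Consider the homogeneous equation Ax = 0. Show that if $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ are
solutions of the equation, then every linear combination $c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_n\mathbf{x}_n$ is
also a solution of the equation.

Solution Since $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n$ are solutions of the matrix equation, we have

\[
A\mathbf{x}_1 = \mathbf{0} \qquad A\mathbf{x}_2 = \mathbf{0} \qquad \ldots \qquad A\mathbf{x}_n = \mathbf{0}
\]

Then using the algebraic properties of matrices, we have

\[
\begin{aligned}
A(c_1\mathbf{x}_1 + c_2\mathbf{x}_2 + \cdots + c_n\mathbf{x}_n) &= A(c_1\mathbf{x}_1) + A(c_2\mathbf{x}_2) + \cdots + A(c_n\mathbf{x}_n) \\
&= c_1(A\mathbf{x}_1) + c_2(A\mathbf{x}_2) + \cdots + c_n(A\mathbf{x}_n) \\
&= c_1\mathbf{0} + c_2\mathbf{0} + \cdots + c_n\mathbf{0} \\
&= \mathbf{0}
\end{aligned}
\]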

The result of Example 4 is an extension of the one given in Example 3 of 
Sec. 1.5. 
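
A concrete instance may help. The Python sketch below uses a hypothetical 3 x 3 matrix A and two hypothetical solutions of Ax = 0 (none of these values come from the text) and checks that several linear combinations of the solutions are again solutions.

```python
# Hypothetical singular matrix: every row is a multiple of (1, 2, -1).
A = [
    [1, 2, -1],
    [2, 4, -2],
    [-1, -2, 1],
]
x1 = [1, 0, 1]    # satisfies x + 2y - z = 0
x2 = [-2, 1, 0]   # satisfies x + 2y - z = 0

def mat_vec(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def combine(scalars, vectors):
    """Linear combination of the given vectors."""
    n = len(vectors[0])
    return [sum(c * vec[i] for c, vec in zip(scalars, vectors)) for i in range(n)]

zero = [0, 0, 0]
for c1, c2 in [(3, -2), (0, 5), (-7, 4)]:
    x = combine([c1, c2], [x1, x2])
    print((c1, c2), x, mat_vec(A, x) == zero)
```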


Vector Form of a Linear System

We have already seen that a linear system with m equations and n variables

\[
\begin{aligned}
a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\
a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\
&\ \,\vdots \\
a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m
\end{aligned}
\]

can be written in matrix form as Ax = b, where A is the m x n coefficient matrix, x
is the vector in $\mathbb{R}^n$ of variables, and b is the vector in $\mathbb{R}^m$ of constants. If we use the
column vectors of the coefficient matrix A, then the matrix equation can be written
in the equivalent form

\[
x_1\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2\begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + x_n\begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix}
\]

This last equation is called the vector form of a linear system. This equation can also
be written as

\[
x_1A_1 + x_2A_2 + \cdots + x_nA_n = \mathbf{b}
\]

where $A_i$ denotes the ith column vector of the matrix A. Observe that this equation is
consistent whenever the vector b can be written as a linear combination of the column
vectors of A.


THEOREM 2 The linear system Ax = b is consistent if and only if the vector b can be expressed
as a linear combination of the column vectors of A.
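
The equivalence between the matrix form and the vector form can be seen on a small case. The sketch below reuses the data of Example 1 of this section, taking A to have columns (1, 0, 1), (−2, 3, −2), (−6, 7, 5) and x = (1, −2, 1); the helper names are hypothetical. It computes b once as the product Ax and once as the combination x_1A_1 + x_2A_2 + x_3A_3.

```python
A = [
    [1, -2, -6],
    [0, 3, 7],
    [1, -2, 5],
]
x = [1, -2, 1]

def mat_vec(A, x):
    """Row-by-row matrix-vector product Ax."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def columns(A):
    """Column vectors A_1, ..., A_n of the matrix A."""
    return [list(col) for col in zip(*A)]

def combine(scalars, vectors):
    """Linear combination of the given vectors."""
    n = len(vectors[0])
    return [sum(c * vec[i] for c, vec in zip(scalars, vectors)) for i in range(n)]

b_from_product = mat_vec(A, x)
b_from_columns = combine(x, columns(A))
print(b_from_product, b_from_columns)
```

Because the two computations agree, exhibiting a solution x of Ax = b is the same as exhibiting b as a linear combination of the columns of A, which is the content of Theorem 2.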


Matrix Multiplication 


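Before concluding this section, we comment on how linear combinations can be used
to describe the product of two matrices. Let A be an m x n matrix and B an n x p
matrix. If $B_i$ is the ith column vector of B, then the ith column vector of the product
AB is given by

\[
\begin{aligned}
AB_i &= \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} b_{1i} \\ b_{2i} \\ \vdots \\ b_{ni} \end{bmatrix} \\
&= \begin{bmatrix} a_{11}b_{1i} + a_{12}b_{2i} + \cdots + a_{1n}b_{ni} \\ a_{21}b_{1i} + a_{22}b_{2i} + \cdots + a_{2n}b_{ni} \\ \vdots \\ a_{m1}b_{1i} + a_{m2}b_{2i} + \cdots + a_{mn}b_{ni} \end{bmatrix} \\
&= \begin{bmatrix} a_{11}b_{1i} \\ a_{21}b_{1i} \\ \vdots \\ a_{m1}b_{1i} \end{bmatrix} + \begin{bmatrix} a_{12}b_{2i} \\ a_{22}b_{2i} \\ \vdots \\ a_{m2}b_{2i} \end{bmatrix} + \cdots + \begin{bmatrix} a_{1n}b_{ni} \\ a_{2n}b_{ni} \\ \vdots \\ a_{mn}b_{ni} \end{bmatrix} \\
&= b_{1i}A_1 + b_{2i}A_2 + \cdots + b_{ni}A_n
\end{aligned}
\]

Since for i = 1, 2, ..., p the product $AB_i$ is the ith column vector of AB, each column
vector of AB is a linear combination of the column vectors of A.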
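
This column description of the product can be checked directly. The matrices in the following sketch are hypothetical, chosen only so that the sizes (2 x 3 and 3 x 2) make AB defined; the helper names are likewise illustrative. Each column of AB is compared with the combination of the columns of A whose scalars are the entries of the matching column of B.

```python
A = [
    [1, 0, 2],
    [-1, 3, 1],
]
B = [
    [3, 1],
    [2, 1],
    [1, 0],
]

def mat_mul(A, B):
    """Matrix product AB using the row-by-column rule."""
    return [
        [sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
        for i in range(len(A))
    ]

def columns(M):
    """Column vectors of the matrix M."""
    return [list(col) for col in zip(*M)]

def combine(scalars, vectors):
    """Linear combination of the given vectors."""
    n = len(vectors[0])
    return [sum(c * vec[i] for c, vec in zip(scalars, vectors)) for i in range(n)]

product_columns = columns(mat_mul(A, B))
# Column i of AB should equal b_{1i} A_1 + b_{2i} A_2 + b_{3i} A_3.
combination_columns = [combine(b_col, columns(A)) for b_col in columns(B)]
print(product_columns, combination_columns)
```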


Fact Summary


1. Every vector in $\mathbb{R}^n$ is a linear combination of the vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$.

2. If $\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_k$ are all solutions to the homogeneous equation Ax = 0,
then so is every linear combination of these vectors.

3. The linear system Ax = b can be written in the equivalent vector form as
$x_1A_1 + x_2A_2 + \cdots + x_nA_n = \mathbf{b}$. The left side is a linear combination of the
column vectors of A.

4. The linear system Ax = b is consistent if and only if b is a linear
combination of the column vectors of A.


Exercise Set 2.2 | -- '__ 


In Exercises 1—6, determine whether the vector v is a =3 = 
linear combination of the vectors vı and v2. 5. v= 10 v = 3 
10 4 
eefi] [i 
i 11 1 1 
w= 4 


—2 3 
E n=[73 | 6. v= 6 v= 
~ | -2 = 8 -1 
3 2 
m=[9] n=l? 
3 
1 = 
3. v= | 1 | y= | 4 | In Exercises 7—12, determine whether the vector v is a 
linear combination of the vectors vı, v2, and v3. 
3 
y= | = | 2 2 
Te y= 8 vV = —2 
2 0 
a Sai 
3 —2 
i v= 0 V3 = 0 
ge. i S 


10. 


11. 


12. 


5 
v= | —4 
-7 
—2 
v= —1 
—1 
—1 
v= 1 
5 
—1 
v= —1 
3 
—3 
v= 5 
5 
1 
V = 4 
1 
3 
_ | —17 
SS 7 
7 
1 
| 6 
V = —1 
2 
6 
aes |e 
~ 1 3 
7 
1 
ot beh 
w= 2 
3 


v = -l 
3 
V3 = —1 
—3 
1 
Y= 2 
-l 
0 
V3 = 1 
2 
—3 
vV = 2 
1 
—1 
v3 = 10 
3 
2 
—3 
y= 4 
1 
—1 
—1 
V3 = 2 
3 
2 
3 
v= 4 
5 
3 
1 
V3 = za 
1 




In Exercises 13—16, find all the ways that v can be 
written as a linear combination of the given vectors. 


evf] vE] 
[4] [i 
w[i] Ll 


—2 3 
V2 = | V3 = 0 
0 0 
15. v= | —1 v=] 1 
—3 1 
—2 —2 
V2 = -l V3 = —3 
2 —1 
2 
V4 = —1 
—2 
~3 -1 
16. v= | —3 vi=|-l 
1 2 
0 0 
v= | -l v= | —1 
—1 —2 
—3 
V4 = —1 
—2 


In Exercises 17—20, determine if the matrix M is a 
linear combination of the matrices Mı, M2, and M3. 
—2 4 | 


7. m= | 4 0 


19. 


20. 


21. 


22. 


23. 


24, 




ee 

o 
we 

taj a 
T 
"a 

an 
TEE 
ele) 

tea=| 3 1 [erax= | _7 |. Write the 


product Ax as a linear combination of the column 
vectors of A. 


12 -1 —1 
Let A = 2 3 4 and x= | —1 
—3 2 1 3 


Write the product Ax as a linear combination of 
the column vectors of A. 
-1 -—2 3 2 . 
3 ICID 5 | wre 
each column vector of AB as a linear 
combination of the column vectors of A. 


Let A=| 


2 0 —1 
Let A = 1 -l 4 and 
—4 3 1 
3 2 1 
B= | -—2 1 0 |. Write each column 
2 -1 1 


vector of AB as a linear combination of the 
column vectors of A. 


In Exercises 25 and 26, write the polynomial p(x), if
possible, as a linear combination of the polynomials
$1 + x$ and $x^2$.


25. $p(x) = 2x^2 - 3x - 1$

26. $p(x) = -x^2 + 3x + 3$


In Exercises 27 and 28, write the polynomial p(x), if 
possible, as a linear combination of the polynomials 


27. 
28. 
29. 


30. 


31. 


32. 


33. 


34. 


35. 


1+x,—x, x? +1 and 2x? —x+1 
p(x) =x? —2x+1 
pa) =x? 


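29. Describe all vectors in $\mathbb{R}^3$ that can be written as a
linear combination of the vectors

\[
\begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix} \qquad \begin{bmatrix} 3 \\ 7 \\ -2 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} 1 \\ 3 \\ 0 \end{bmatrix}
\]

30. Describe all 2 x 2 matrices that can be written as
a linear combination of the matrices

\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]

31. If $\mathbf{v} = \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + \mathbf{v}_4$ and
$\mathbf{v}_4 = \mathbf{v}_1 - 2\mathbf{v}_2 + 3\mathbf{v}_3$, write v as a linear
combination of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$.

32. If $\mathbf{v} = \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3 + \mathbf{v}_4$ and $\mathbf{v}_2 = 2\mathbf{v}_1 - 4\mathbf{v}_3$,
write v as a linear combination of $\mathbf{v}_1$, $\mathbf{v}_3$, and $\mathbf{v}_4$.

33. Suppose that the vector v is a linear combination
of the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$, and
$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0}$, with $c_1 \neq 0$. Show
that v is a linear combination of $\mathbf{v}_2, \ldots, \mathbf{v}_n$.

34. Suppose that the vector v is a linear combination
of the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$, and $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m$
are another m vectors. Show that v is a linear
combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n, \mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m$.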


35. Let $S_1$ be the set of all linear combinations of the
vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ in $\mathbb{R}^n$, and $S_2$ be the set of
all linear combinations of the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k, c\mathbf{v}_k$,
where c is a nonzero scalar. Show that $S_1 = S_2$.

36. Let $S_1$ be the set of all linear combinations of the
vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k$ in $\mathbb{R}^n$, and $S_2$ be the set of
all linear combinations of the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k, \mathbf{v}_1 + \mathbf{v}_2$.
Show that $S_1 = S_2$.

37. Suppose that Ax = b is a 3 x 3 linear system that
is consistent. If there is a scalar c such that
$A_3 = cA_1$, then show that the linear system has
infinitely many solutions.

38. Suppose that Ax = b is a 3 x 3 linear system that
is consistent. If $A_3 = A_1 + A_2$, then show that the
linear system has infinitely many solutions.

39. The equation

\[
2y'' - 3y' + y = 0
\]

is an example of a differential equation. Show that
$y = f(x) = e^x$ and $y = g(x) = e^{x/2}$ are solutions
to the equation. Then show that any linear
combination of f(x) and g(x) is another solution
to the differential equation.

2.3 Linear Independence

In Sec. 2.2 we saw that given a set S of vectors in R^n, it is not always possible to express every vector in R^n as a linear combination of vectors from S. At the other extreme, there are infinitely many different subsets S such that the collection of all linear combinations of vectors from S is R^n. For example, the collection of all linear combinations of the set of vectors S = {e1, ..., en} is R^n, but so is the collection of linear combinations of T = {e1, ..., en, e1 + e2}. In this way S and T both generate R^n. To characterize those minimal sets S that generate R^n, we require the concept of linear independence. As motivation, let two vectors u and v in R^2 lie on the same line, as shown in Fig. 1. Thus, there is a nonzero scalar c such that

u = cv

This condition can also be written as

u − cv = 0

In this case we say that the vectors u and v are linearly dependent. Evidently, two vectors u and v are linearly dependent provided that the zero vector is a nontrivial (not both scalars 0) linear combination of the vectors. On the other hand, the vectors shown in Fig. 2 are not linearly dependent. This concept is generalized to sets of vectors in R^n.

[Figure 1: vectors u and v in the xy plane lying on the same line through the origin, with u a scalar multiple of v]

[Figure 2: vectors u and v in the xy plane that do not lie on the same line]

DEFINITION 1 Linearly Independent and Linearly Dependent The set of vectors S = {v1, v2, ..., vm} in R^n is linearly independent provided that the only solution to the equation

c1v1 + c2v2 + ··· + cmvm = 0

is the trivial solution c1 = c2 = ··· = cm = 0. If the above equation has a nontrivial solution, then the set S is called linearly dependent.


For example, the set of vectors S = {e1, ..., en} in R^n is linearly independent.

EXAMPLE 1 Determine whether the vectors

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 2 \end{bmatrix}
\qquad
\mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \\ 2 \end{bmatrix}
\qquad \text{and} \qquad
\mathbf{v}_3 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 3 \end{bmatrix}
\]

are linearly independent or linearly dependent.

Solution We seek solutions to the vector equation

\[
c_1 \begin{bmatrix} 1 \\ 0 \\ 1 \\ 2 \end{bmatrix}
+ c_2 \begin{bmatrix} 0 \\ 1 \\ 1 \\ 2 \end{bmatrix}
+ c_3 \begin{bmatrix} 1 \\ 1 \\ 1 \\ 3 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\]

From this we obtain the linear system

\[
\begin{aligned}
c_1 + c_3 &= 0 \\
c_2 + c_3 &= 0 \\
c_1 + c_2 + c_3 &= 0 \\
2c_1 + 2c_2 + 3c_3 &= 0
\end{aligned}
\]

Subtracting the first equation from the third equation gives c2 = 0. Then, from equation 2, we have c3 = 0, and from equation 1 we have c1 = 0. Hence, the only solution to the linear system is the trivial solution c1 = c2 = c3 = 0. Therefore, the vectors are linearly independent.
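The conclusion of Example 1 can also be checked mechanically. The sketch below is only an illustration, not the method used in the text: it relies on the notion of rank (the number of pivots after elimination), which has not been introduced at this point, and the function names `rank` and `is_linearly_independent` are hypothetical helpers. It assumes that a set of vectors is linearly independent exactly when the matrix having those vectors as rows has rank equal to the number of vectors.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix given as a list of rows, using exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for col in range(len(m[0])):
        # look for a row at or below r with a nonzero entry in this column
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        # eliminate the entries below the pivot
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def is_linearly_independent(vectors):
    """Assumption: independence is equivalent to rank == number of vectors."""
    return rank(vectors) == len(vectors)

# The vectors of Example 1
v1, v2, v3 = [1, 0, 1, 2], [0, 1, 1, 2], [1, 1, 1, 3]
print(is_linearly_independent([v1, v2, v3]))  # True
```

Exact fractions are used instead of floating-point numbers so that a zero pivot is detected reliably.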


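EXAMPLE 2 Determine whether the vectors

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}
\qquad
\mathbf{v}_2 = \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}
\qquad
\mathbf{v}_3 = \begin{bmatrix} -2 \\ 3 \\ 1 \end{bmatrix}
\qquad
\mathbf{v}_4 = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}
\]

are linearly independent.

Solution As in Example 1, we need to solve

\[
c_1 \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}
+ c_2 \begin{bmatrix} -1 \\ 1 \\ 2 \end{bmatrix}
+ c_3 \begin{bmatrix} -2 \\ 3 \\ 1 \end{bmatrix}
+ c_4 \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

This leads to the homogeneous linear system

\[
\begin{aligned}
c_1 - c_2 - 2c_3 + 2c_4 &= 0 \\
c_2 + 3c_3 + c_4 &= 0 \\
2c_1 + 2c_2 + c_3 + c_4 &= 0
\end{aligned}
\]

with solution set given by

S = {(−2t, 2t, −t, t) | t ∈ R}

Since the linear system has infinitely many solutions, the set of vectors {v1, v2, v3, v4} is linearly dependent.

In Example 2, we verified that the set of vectors {v1, v2, v3, v4} is linearly dependent. Observe further that the vector v4 is a linear combination of v1, v2, and v3, that is,

v4 = 2v1 − 2v2 + v3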
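As a quick arithmetic check of Example 2, the following sketch substitutes several members of the solution set (−2t, 2t, −t, t) back into the vector equation and confirms the relation v4 = 2v1 − 2v2 + v3. The helper name `combine` is hypothetical and introduced only for this illustration.

```python
def combine(coeffs, vectors):
    """Return the linear combination sum(c_i * v_i) as a list of components."""
    size = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(size)]

# The vectors of Example 2
v1, v2, v3, v4 = [1, 0, 2], [-1, 1, 2], [-2, 3, 1], [2, 1, 1]

# Each choice of t in (-2t, 2t, -t, t) should produce the zero vector
for t in (1, -3, 5):
    assert combine([-2 * t, 2 * t, -t, t], [v1, v2, v3, v4]) == [0, 0, 0]

# Taking t = 1 and solving for v4 gives v4 = 2*v1 - 2*v2 + v3
print(combine([2, -2, 1], [v1, v2, v3]) == v4)  # True
```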
In Theorem 3 we establish that any finite collection of vectors in R^n, where the number of vectors exceeds n, is linearly dependent.

THEOREM 3 Let S = {v1, v2, ..., vn} be a set of n vectors in R^m. If n > m, then the set S is linearly dependent.

Proof Let A be the m × n matrix with column vectors the vectors of S, so that

A_i = v_i for i = 1, 2, ..., n

In this way the equation

c1v1 + c2v2 + ··· + cnvn = 0

in matrix form, is the homogeneous linear system

\[
A\mathbf{c} = \mathbf{0}
\qquad \text{where} \qquad
\mathbf{c} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}
\]

As A is not square with n > m, there is at least one free variable. Thus, the solution is not unique and S = {v1, ..., vn} is linearly dependent.


Notice that from Theorem 3, any set of three or more vectors in R^2, four or more vectors in R^3, five or more vectors in R^4, and so on, is linearly dependent. This theorem does not address the case for which n ≤ m. In this case, a set of n vectors in R^m may be either linearly independent or linearly dependent.

The notions of linear independence and dependence can be generalized to include 
other objects, as illustrated in Example 3. 


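EXAMPLE 3 Determine whether the matrices

\[
M_1 = \begin{bmatrix} 1 & 0 \\ 3 & 2 \end{bmatrix}
\qquad
M_2 = \begin{bmatrix} -1 & 2 \\ 3 & 2 \end{bmatrix}
\qquad \text{and} \qquad
M_3 = \begin{bmatrix} 5 & -6 \\ -3 & -2 \end{bmatrix}
\]

are linearly independent.

Solution Solving the equation

\[
c_1 \begin{bmatrix} 1 & 0 \\ 3 & 2 \end{bmatrix}
+ c_2 \begin{bmatrix} -1 & 2 \\ 3 & 2 \end{bmatrix}
+ c_3 \begin{bmatrix} 5 & -6 \\ -3 & -2 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

is equivalent to solving

\[
\begin{bmatrix} c_1 - c_2 + 5c_3 & 2c_2 - 6c_3 \\ 3c_1 + 3c_2 - 3c_3 & 2c_1 + 2c_2 - 2c_3 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

Equating corresponding entries gives the linear system

\[
\begin{aligned}
c_1 - c_2 + 5c_3 &= 0 \\
2c_2 - 6c_3 &= 0 \\
3c_1 + 3c_2 - 3c_3 &= 0 \\
2c_1 + 2c_2 - 2c_3 &= 0
\end{aligned}
\]

The augmented matrix of the linear system is

\[
\left[\begin{array}{rrr|r}
1 & -1 & 5 & 0 \\
0 & 2 & -6 & 0 \\
3 & 3 & -3 & 0 \\
2 & 2 & -2 & 0
\end{array}\right]
\quad \text{which reduces to} \quad
\left[\begin{array}{rrr|r}
1 & 0 & 2 & 0 \\
0 & 1 & -3 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{array}\right]
\]

Therefore, the solution set is

S = {(−2t, 3t, t) | t ∈ R}

Since the original equation has infinitely many solutions, the matrices are linearly dependent.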
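The dependence found in Example 3 can be confirmed by direct substitution. The sketch below forms the entrywise combination c1M1 + c2M2 + c3M3 for a few members of the solution set (−2t, 3t, t); the function name `linear_combination` is a hypothetical helper used only for this check.

```python
def linear_combination(coeffs, matrices):
    """Entrywise linear combination of equally sized matrices (lists of rows)."""
    rows, cols = len(matrices[0]), len(matrices[0][0])
    return [
        [sum(c * M[i][j] for c, M in zip(coeffs, matrices)) for j in range(cols)]
        for i in range(rows)
    ]

# The matrices of Example 3
M1 = [[1, 0], [3, 2]]
M2 = [[-1, 2], [3, 2]]
M3 = [[5, -6], [-3, -2]]

zero = [[0, 0], [0, 0]]
for t in (1, 2, -4):
    # every nonzero t gives a nontrivial combination equal to the zero matrix
    assert linear_combination([-2 * t, 3 * t, t], [M1, M2, M3]) == zero

print(linear_combination([-2, 3, 1], [M1, M2, M3]))  # [[0, 0], [0, 0]]
```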


Criteria to determine if a set of vectors is linearly independent or dependent are 
extremely useful. The next several theorems give situations where such a determination 
can be made. 


THEOREM 4 If a set of vectors S = {v1, v2, ..., vn} contains the zero vector, then S is linearly dependent.

Proof Suppose that the vector vk = 0, for some index k, with 1 ≤ k ≤ n. Setting c1 = c2 = ··· = c_{k−1} = 0, ck = 1, and c_{k+1} = c_{k+2} = ··· = cn = 0, we have

0v1 + ··· + 0v_{k−1} + 1vk + 0v_{k+1} + ··· + 0vn = 0

which shows that the set of vectors is linearly dependent.


THEOREM 5 A set of nonzero vectors is linearly dependent if and only if at least one of the vectors is a linear combination of other vectors in the set.

Proof Let S = {v1, v2, ..., vn} be a set of nonzero vectors that is linearly dependent. Then there are scalars c1, c2, ..., cn, not all 0, with

c1v1 + c2v2 + ··· + cnvn = 0

Suppose that ck ≠ 0 for some index k. Then solving the previous equation for the vector vk, we have

\[
\mathbf{v}_k = -\frac{c_1}{c_k}\mathbf{v}_1 - \cdots - \frac{c_{k-1}}{c_k}\mathbf{v}_{k-1} - \frac{c_{k+1}}{c_k}\mathbf{v}_{k+1} - \cdots - \frac{c_n}{c_k}\mathbf{v}_n
\]

Conversely, let vk be such that

vk = c1v1 + c2v2 + ··· + c_{k−1}v_{k−1} + c_{k+1}v_{k+1} + ··· + cnvn

Then

c1v1 + c2v2 + ··· + c_{k−1}v_{k−1} − vk + c_{k+1}v_{k+1} + ··· + cnvn = 0

Since the coefficient of vk is −1, the linear system has a nontrivial solution. Hence, the set S is linearly dependent.


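As an illustration, let S be the set of vectors

\[
S = \left\{
\begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix},
\begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix},
\begin{bmatrix} 2 \\ 6 \\ 2 \end{bmatrix}
\right\}
\]

Notice that the third vector is twice the first vector, that is,

\[
\begin{bmatrix} 2 \\ 6 \\ 2 \end{bmatrix} = 2 \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix}
\]

Thus, by Theorem 5, the set S is linearly dependent.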


| EXAMPLE4 | Verify that the vectors 


[4] w[i] om vail 


are linearly dependent. Then show that not every one of these vectors can be written as a linear combination of the others.

Solution By Theorem 3, any three vectors in R^2 are linearly dependent. Now, observe that v1 and v3 are linear combinations of the other two vectors, that is,

v1 = 0v2 − v3 and v3 = 0v2 − v1


However, v2 cannot be written as a linear combination of v1 and v3. To see this, notice that the equation

av1 + bv3 = v2

is equivalent to an inconsistent linear system. As shown in Fig. 3, any linear combination of the vectors v1 and v3 is a vector that is along the x axis. Therefore, v2 is not a linear combination of v1 and v3.

[Figure 3: the vectors v1, v2, and v3 in the xy plane, with v1 and v3 along the x axis]


THEOREM 6

1. If a set of vectors S is linearly independent, then any subset of S is also a linearly independent set of vectors.

2. If a set of vectors T is linearly dependent and S is a set of vectors that contains T, then S is also a linearly dependent set of vectors.

Proof (1) Let T be a subset of S. Reorder and relabel the vectors of S, if necessary, so that T = {v1, ..., vk} and S = {v1, ..., vk, v_{k+1}, ..., vm}. Consider the equation

c1v1 + c2v2 + ··· + ckvk = 0

Next let c_{k+1} = c_{k+2} = ··· = cm = 0, and consider the linear combination

c1v1 + c2v2 + ··· + ckvk + 0v_{k+1} + ··· + 0vm = 0

Since S is linearly independent, c1 = c2 = ··· = ck = 0 and hence T is linearly independent.

(2) Let T = {v1, ..., vk} and suppose that T ⊆ S. Label the vectors of S that are not in T as v_{k+1}, ..., vm. Since T is linearly dependent, there are scalars c1, ..., ck, not all 0, such that

c1v1 + c2v2 + ··· + ckvk = 0

Then c1, c2, ..., ck, c_{k+1} = c_{k+2} = ··· = cm = 0 is a collection of m scalars, not all 0, with

c1v1 + c2v2 + ··· + ckvk + 0v_{k+1} + ··· + 0vm = 0

Consequently, S is linearly dependent.


Given a set of vectors S = {v1, ..., vn} and an arbitrary vector v not in S, we have seen that it may or may not be possible to write v as a linear combination of the vectors of S. We have also seen that sometimes v can be written as a linear combination of the vectors of S in infinitely many ways. That this cannot happen for a linearly independent set is the content of Theorem 7.


THEOREM 7 Let S = {v1, v2, ..., vn} be a linearly independent set. Suppose that there are scalars c1, c2, ..., cn such that

\[
\mathbf{v} = \sum_{k=1}^{n} c_k \mathbf{v}_k
\]

Then the scalars are unique.

Proof To prove the result, let v be written as

\[
\mathbf{v} = \sum_{k=1}^{n} c_k \mathbf{v}_k
\qquad \text{and as} \qquad
\mathbf{v} = \sum_{k=1}^{n} d_k \mathbf{v}_k
\]

Then

\[
\mathbf{0} = \mathbf{v} - \mathbf{v}
= \sum_{k=1}^{n} c_k \mathbf{v}_k - \sum_{k=1}^{n} d_k \mathbf{v}_k
= \sum_{k=1}^{n} (c_k - d_k)\mathbf{v}_k
\]

Since the set of vectors S is linearly independent, the only solution to this last equation is the trivial one. That is,

c1 − d1 = 0, c2 − d2 = 0, ..., cn − dn = 0, or c1 = d1, c2 = d2, ..., cn = dn


Linear Systems 


At the end of Sec. 2.2, in Theorem 2, we made the observation that a linear system Ax = b is consistent if and only if the vector b is a linear combination of the column vectors of the matrix A. Theorem 8 gives criteria for when the solution is unique.

THEOREM 8 Let Ax = b be a consistent m × n linear system. The solution is unique if and only if the column vectors of A are linearly independent.


Proof First we prove that linear independence of the column vectors implies that the solution is unique. Suppose that the column vectors A1, A2, ..., An are linearly independent, and let

\[
\mathbf{c} = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}
\qquad \text{and} \qquad
\mathbf{d} = \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix}
\]

be solutions to the linear system. In vector form, we have

c1A1 + c2A2 + ··· + cnAn = b and d1A1 + d2A2 + ··· + dnAn = b

By Theorem 7, c1 = d1, c2 = d2, ..., cn = dn. Hence, c = d and the solution to the linear system is unique.

To prove the converse, we will prove the contrapositive statement. Let v be a solution to the linear system Ax = b, and assume that the column vectors of A are linearly dependent. Then there are scalars c1, c2, ..., cn, not all 0, such that

c1A1 + c2A2 + ··· + cnAn = 0

that is, if c is the vector with components c1, c2, ..., cn, then Ac = 0. Since matrix multiplication satisfies the distributive property,

A(v + c) = Av + Ac = b + 0 = b

Therefore, the vector v + c is another solution to the linear system, and the solution is not unique. This completes the proof of the contrapositive statement. Therefore, we have shown that if the solution is unique, then the column vectors of A are linearly independent.
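The second half of the proof can be illustrated numerically. In the sketch below, the matrix A, the solution v, and the vector c are hypothetical choices made for illustration (the third column of A equals twice the second column minus the first, so Ac = 0 for c = (1, −2, 1)); `matvec` is likewise an illustrative helper. The point is only that v and v + c both solve Ax = b.

```python
def matvec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical matrix with linearly dependent columns: A1 - 2*A2 + A3 = 0
A = [[1, 1, 1],
     [0, 2, 4],
     [3, 4, 5]]

v = [1, 1, 1]        # one solution of Ax = b, where b is defined as A v
b = matvec(A, v)
c = [1, -2, 1]       # nontrivial solution of Ac = 0

assert matvec(A, c) == [0, 0, 0]

# v + c differs from v but solves the same system, so the solution is not unique
w = [vi + ci for vi, ci in zip(v, c)]
print(w, matvec(A, w) == b)  # [2, -1, 2] True
```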


Theorem 8 provides another way of establishing Theorem 11 of Sec. 1.5 that a linear system has no solutions, one solution, or infinitely many solutions.


Linear Independence and Determinants 


In Chap. 1, we established that a square matrix A is invertible if and only if det(A) ≠ 0. (See Theorem 16 of Sec. 1.6.) This is equivalent to the statement that the linear system Ax = b has a unique solution for every b if and only if det(A) ≠ 0. This gives an alternative method for showing that a set of vectors is linearly independent. Specifically, if A is a square matrix, then by Theorem 8, the column vectors of A are linearly independent if and only if det(A) ≠ 0.


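EXAMPLE 5 Let

\[
S = \left\{
\begin{bmatrix} 1 \\ 0 \\ 3 \end{bmatrix},
\begin{bmatrix} 1 \\ 2 \\ 4 \end{bmatrix},
\begin{bmatrix} 1 \\ 4 \\ 5 \end{bmatrix}
\right\}
\]

Determine whether the set S is linearly independent.

Solution Let A be the matrix whose column vectors are the vectors of S; that is,

\[
A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 2 & 4 \\ 3 & 4 & 5 \end{bmatrix}
\]

The determinant of A can be found by expanding along the first column, so that

\[
\det(A) = 1 \begin{vmatrix} 2 & 4 \\ 4 & 5 \end{vmatrix}
- 0 \begin{vmatrix} 1 & 1 \\ 4 & 5 \end{vmatrix}
+ 3 \begin{vmatrix} 1 & 1 \\ 2 & 4 \end{vmatrix}
= -6 - 0 + 3(2) = 0
\]

Therefore, by the previous remarks S is linearly dependent.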
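The cofactor expansion used in Example 5 is easy to reproduce. The following is a minimal sketch for 3 × 3 matrices only, expanding along the first column as in the solution; the names `det2` and `det3_first_column` are hypothetical helpers, not notation from the text.

```python
def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix [[a, b], [c, d]]."""
    return a * d - b * c

def det3_first_column(A):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first column."""
    (a11, a12, a13), (a21, a22, a23), (a31, a32, a33) = A
    return (a11 * det2(a22, a23, a32, a33)
            - a21 * det2(a12, a13, a32, a33)
            + a31 * det2(a12, a13, a22, a23))

# Matrix whose columns are the vectors of S in Example 5
A = [[1, 1, 1],
     [0, 2, 4],
     [3, 4, 5]]

d = det3_first_column(A)
print(d)  # 0
print("independent" if d != 0 else "dependent")  # dependent
```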


The final theorem summarizes the connections that have thus far been established 
concerning solutions to a linear system, linear independence, invertibility of matrices, 
and determinants. 


THEOREM 9 Let A be a square matrix. Then the following statements are equivalent.

1. The matrix A is invertible.

2. The linear system Ax = b has a unique solution for every vector b.

3. The homogeneous linear system Ax = 0 has only the trivial solution.

4. The matrix A is row equivalent to the identity matrix.

5. The determinant of the matrix A is nonzero.

6. The column vectors of A are linearly independent.


Fact Summary

Let S be a set of m vectors in R^n.

1. If m > n, then S is linearly dependent.

2. If the zero vector is in S, then S is linearly dependent.

3. If u and v are in S and there is a scalar c such that u = cv, then S is linearly dependent.

4. If any vector in S is a linear combination of other vectors in S, then S is linearly dependent.

5. If S is linearly independent and T is a subset of S, then T is linearly independent.

6. If T is linearly dependent and T is a subset of S, then S is linearly dependent.

7. If S = {v1, ..., vm} is linearly independent and v = c1v1 + ··· + cmvm, then the set of scalars c1, ..., cm is uniquely determined.

8. A consistent linear system Ax = b has a unique solution if and only if the column vectors of A are linearly independent.

9. If A is a square matrix, then the column vectors of A are linearly independent if and only if det(A) ≠ 0.

Exercise Set 2.3

In Exercises 1–10, determine whether the given vectors are linearly independent.


eefi] [i] 


V3 


V3 


—1 


=2 


3 1 
—1 0 
9, vV = =i w= 2 
2 1 
3 
-l 
V3 = 0 
1 
—2 3 
= —4 
10. vV = i V = 0 
1 4 
-l 
—12 
V3 = 2 
6 


In Exercises 11—14, determine whether the matrices 
are linearly independent. 


12. Mı = 


13. Mı = 


14. Mı = 


In Exercises 15—18, explain, without solving a linear 
system, why the set of vectors is linearly dependent. 


snf] 


=| =|] 
1 
»=[5] 
1 0 
17. v= | -6 |w=] 0 
2 
4 
V3 = 7 
1 
1 1 
18. v; = 0 |vw= | -2 
—2 1 
2 
V3 = —2 
-l 


In Exercises 19 and 20, explain, without solving a 


linear system, why the column vectors of the matrix A 


are linearly dependent. 


—1 2 5 
19. a. A= 3 —6 3 
2 4 3 
2 V3 
b. A= 1 0 1 
-1 1 0 


| 

= 

| 

— 
Nore WW 


21. 


22. 


23. 


24. 




Determine the values of a such that the vectors 


1 —1 2 
2 0 a 
1 1 4 


are linearly independent. 


Determine the values of a such that the matrices 


ea ah ees 


are linearly independent. 


Let 
1 1 
v = 1 v=]|2 V3 = 
1 3 


a. Show that the vectors are linearly independent. 


b. Find the unique scalars c1, c2, c3 such that the 


vector 
2 


v= 1 
3 
can be written as 


vV = C1V1 + €2V2 + €3V3 


Let 


a. Show that the matrices are linearly 
independent. 


b. Find the unique scalars c1, c2, c3 such that the 


matrix 
3 5 
“=|; j 


can be written as 


M = cıMı + c2 M2 + c3 M3 




c. Show that the matrix 
0 3 
alee) 
cannot be written as a linear combination of 
Mı, Mn, and M3. 
In Exercises 25 and 26, for the given matrix A 


determine if the linear system Ax = b has a unique 
solution. 


1 2 0 

25. A= | —-1 0 3 

2 1 2 
3 2 4 
26.A=]1 -1 4 
0 2 —-4 


In Exercises 27–30, determine whether the set of polynomials is linearly independent or linearly dependent. A set of polynomials S = {p1(x), p2(x), ..., pn(x)} is linearly independent provided

c1p1(x) + c2p2(x) + ··· + cnpn(x) = 0

for all x implies that

c1 = c2 = ··· = cn = 0


27. pi(x) = 1 p(x) = —2 + 4x? 
p3(x) = 2x pa(x) = —12x + 8x7 
28. pi(x) = 1 po(x) =x 
p3(x) = 5 + 2x — x? 
29. pi(x) = 2 p(x) = x p(x) = x? 
pa(x) =3x —1 
30. pi(x) = x? — 2x? + 1 po(x) = 5x 
p3(x) = x? — 4 pa(x) = x? + 2x 
In Exercises 31–34, show that the set of functions is linearly independent on the interval [0, 1]. A set of functions S = {f1(x), f2(x), ..., fn(x)} is linearly independent on the interval [a, b] provided

c1f1(x) + c2f2(x) + ··· + cnfn(x) = 0

for all x ∈ [a, b] implies that

c1 = c2 = ··· = cn = 0


31. 
32. 


33. 
34. 


35. 


36. 


37. 


38. 


39. 


40. 


fi(x) = cos mx fo(x) = sin mx 


fi) =e fha) =e 
f(x) = e” 


fia) =x ha) = x2 Ao Se 
fi@) =x hx) = e 


f(x) = sin nx 


Verify that two vectors u and v in R” are linearly 
dependent if and only if one is a scalar multiple 
of the other. 


Suppose that S = {v1, V2, v3} is linearly 
independent and 


Wi = Vi + V2 + V3 w2 = V2 + V3 


and 

W3 = V3 
Show that T = {w1, W2, w3} is linearly 
independent. 


Suppose that S = {v1, V2, v3} is linearly 
independent and 


w =V +V W2 = V2 — V3 


and 

W3 = V2 + V3 
Show that T = {w1, W2, w3} is linearly 
independent. 


Suppose that S = {v 1, V2, v3} is linearly 
independent and 


Wi1=V2 w2 = Vi + V3 


and 

W3 = Vi + V2 + V3 
Determine whether the set T = {w1, W2, W3} is 
linearly independent or linearly dependent. 


Suppose that the set S$ = {v1, v2} is linearly 
independent. Show that if v3 cannot be written as 
a linear combination of vı and v2, then 

{V1, V2, V3} is linearly independent. 


Let S = {v1, V2, V3}, where v3 = vj + v2. 
a. Write vı as a linear combination of the vectors 
in S in three different ways. 


b. Find all scalars c1, c2, and c3 such that 
Vi = C1V1 + C2V2 + €3V3. 


41. Show that if the column vectors A1, ..., An of an m × n matrix A are linearly independent, then

{x ∈ R^n | Ax = 0} = {0}

42. Let v1, ..., vk be linearly independent vectors in R^n, and suppose A is an invertible n × n matrix. Define vectors w_i = Av_i, for i = 1, ..., k. Show that the vectors w1, ..., wk are linearly independent. Show, using a 2 × 2 matrix, that the requirement of invertibility is necessary.


Review Exercises for Chapter 2 


1. If ad − bc ≠ 0, show that the vectors


[es] om [a 


are linearly independent. Suppose that 
ad — bc = 0. What can you say about the two 
vectors? 


2. Suppose that S = {v1, v2, v3} is a linearly independent set of vectors in R^n. Show that T = {v1, v2, v1 + v2 + v3} is also linearly independent.


3. Determine for which nonzero values of a the 
vectors 


a? (0) 1 
a and 
1 2 1 


are linearly independent. 


4. Let 
2s—t 


s 
S= s,teR 
t 
s 


a. Find two vectors in R4 so that all vectors in S 
can be written as a linear combination of the 
two vectors. 


b. Are the vectors found in part (a) linearly 


independent? 
5. Let 
1 1 
v= 0 and w= 1 
2 1 


a. Is S = {vj, v2} linearly independent? 


a 
b. Find a vector | b 
c 


that cannot be written 


as a linear combination of vı and v2. 


c. Describe all vectors in R? that can be written 
as a linear combination of vı and v2. 


d. Let 


V3 = 0 
0 
Is T = {v1, V2, V3} linearly independent or 
linearly dependent? 
e. Describe all vectors in R? that can be written 
as a linear combination of vj, v2, and v3. 


6. Let 
2 
vV = —1 V = 1 
1 
0 —2 
v= | 2 and v4 = 2; 
1 1 


a. Show that S = {v1, V2, V3, v4} is linearly 
dependent. 

b. Show that T = {v1, V2, v3} is linearly 
independent. 

c. Show that v4 can be written as a linear 
combination of v1, V2, and v3. 




d. How does the set of all linear combinations of 


vectors in S compare with the set of all linear 
combinations of vectors in T? 


7. Consider the linear system 


t y+2z+ w= 3 
=x + z+2w= 1 
2x + 2y + w=-2 
x+ y+2z+3w= 5 


. Write the linear system in the matrix form 
Ax =b. 


. Find the determinant of the coefficient 
matrix A. 


. Are the column vectors of A linearly 
independent? 


. Without solving the linear system, determine 
whether it has a unique solution. 


. Solve the linear system. 


8. Let 


a. Show that the set {M,, M2, M3} is linearly 


independent. 


b. Find the unique scalars c1, c2, and c3 such that 


1 -l 
| 2 1 = cı Mı + c2M2 + 03M3 
1 -l 
1 2 


be written as a linear combination of Mı, M2, 
and M3? 


c. Can the matrix 


d. 


b 
d that can be 


written as a linear combination of Mı, Mo, 
and M3. 


: ; a 
Describe all matrices | : 


9. Let 


10. 


2 


me OW 


1 -l 


. Write the linear system Ax = b in vector 


form. 


. Compute det(A). What can you conclude as to 


whether the linear system is consistent or 
inconsistent? 


. Are the column vectors of A linearly 


independent? 


. Without solving the linear system, does the 


system have a unique solution? Give two 
reasons. 


Two vectors in R^n are perpendicular provided their dot product is 0. Suppose S = {v1, v2, ..., vn} is a set of nonzero vectors which are pairwise perpendicular. Follow the steps to show S is linearly independent.

a. Show that for any vector v the dot product satisfies v · v ≥ 0.

b. Show that for any vector v ≠ 0 the dot product satisfies v · v > 0.

c. Show that for all vectors u, v, and w the dot product satisfies

u · (v + w) = u · v + u · w

d. Consider the equation

c1v1 + c2v2 + ··· + cnvn = 0

Use the dot product of v_i, for each 1 ≤ i ≤ n, with the expression on the left of the previous equation to show that c_i = 0, for each 1 ≤ i ≤ n.




Chapter 2: Chapter Test 


In Exercises 1—33, determine whether the statement is 


true or false. 


1. Every vector in R? can be written as a linear 
combination of 


1 0 0 
0 1 0 
0 0 1 


2. Every 2 x 2 matrix can be written as a linear 


combination of 
1 0 0 1 0 0 
0 0 0 0 1 0 


3. Every 2 x 2 matrix can be written as a linear 
combination of 


1 2 
vV = V = 1 
1 0 

4 

V3 = 3 

-1 


4. The set S = {v1, V2, v3} is linearly independent. 


5. There are scalars cı and c2 so that 
V3 = C1 V1 + 2V2. 

6. The vector v can be written as a linear 
combination of vı and v3. 


7. The vector vı can be written as a linear 
combination of v2 and v3. 


8. 


10. 


If vj, v2, and v3 are the column vectors of a 3 x 3 
matrix A, then the linear system Ax = b has a 
unique solution for all vectors b in R>. 


. The polynomial p(x) = 3 + x can be written as a 


linear combination of qı (x) = 1+ x and 


qz(x) = 1 —x—x?, 


The set 
1 
S= —1 |,} 0], 
1 
2 0 
—2 |,| 1 
6 0 


is linearly independent. 


In Exercises 11—14, use the matrices 


11. 


12. 


14. 


The set S = {M,, M2, M3, M4} is linearly 
independent. 


The set T = {M1, M2, M3} is linearly 
independent. 


. The set of all linear combinations of matrices in S 


is equal to the set of all linear combinations of 
matrices in T. 


Every matrix that can be written as a linear 
combination of the matrices in T has the 
form 




15. The vectors 


sS 1 0 
0 sS 1 
0 1 RY 


are linearly independent if and only if s = 0 or 
s=l. 


In Exercises 16—19, use the vectors 


ea 


16. The set S = {v1, v2} is linearly independent. 


17. Every vector in R? can be written as a linear 
combination of vı and v2. 


18. If the column vectors of a matrix A are vı and v2, 
then det(A) = 0. 


19. If b is in R? and cyv; + c2v2 = b, then 


z]= 
c2 


where A is the 2 x 2 matrix with column vectors 
vı and vz. 


20. The column vectors of the matrix 
cos@ sinô 
—sinð cos@ 


are linearly independent. 


21. If vı and v2 are linearly independent vectors 
in R” and v3 cannot be written as a scalar 
multiple of vı, then v1, v2, and v3 are linearly 
independent. 


22. If S = {v1, V2,..., Vm} is a set of nonzero vectors 
in R” that are linearly dependent, then every 


23. 


24, 


25. 


26. 


27. 


28. 


29. 


30. 


31. 


32. 


33. 




vector in S can be written as a linear combination 
of the others. 


If vı and vz are in R?, then the matrix with 
column vectors vj, V2, and vj + v2 has a nonzero 
determinant. 


If vı and vz are linearly independent, vı, v2, and 
vı + v2 are also linearly independent. 


If the set S contains the zero vector, then S is 
linearly dependent. 


The column vectors of an n x n invertible matrix 
can be linearly dependent. 


If A is an n x n matrix with linearly independent 
column vectors, then the row vectors of A are 
also linearly independent. 


If the row vectors of a nonsquare matrix are 
linearly independent, then the column vectors are 
also linearly independent. 


If v1, V2, V3, and v4 are in R4 and {v1, v2, v3} is 
linearly dependent, then {v 1, v2, v3, v4} is linearly 
dependent. 


If v1, V2, v3, and v4 are in R4 and {v1, vo, v3} is 
linearly independent, then {v1, V2, V3, v4} is 
linearly independent. 


If v1, V2, V3, and v4 are in R4 and {vj, vo, V3, V4} 
is linearly independent, then {v1, v2, v3} is linearly 
independent. 


If v1, V2, V3, and v4 are in R4 and {vy, v2, V3, V4} 
is linearly dependent, then {v1, v2, v3} is linearly 
dependent. 


If S = {v1}, V2, ..., Vs} is a subset of R4, then S is 
linearly dependent. 


Vector Spaces

3.1 Definition of a Vector Space
3.2 Subspaces
3.3 Basis and Dimension
3.4 Coordinates and Change of Basis
3.5 Application: Differential Equations


When a digital signal is sent through space (sometimes across millions of miles), errors in the signal are bound to occur. In response to the need for reliable information, mathematicians and scientists from a variety of disciplines have developed ways to improve the quality of these transmissions. One obvious method is to send messages repeatedly to increase the likelihood of receiving them correctly. This, however, is time-consuming and limits the number of messages that can be sent. An innovative methodology developed by Richard Hamming in 1947 involves embedding in the transmission a means for error detection and self-correction. One of Hamming's coding schemes, known as Hamming's (7,4) code, uses binary vectors (vectors consisting of 1s and 0s) with seven components. Some of these vectors are identified as codewords depending on the configuration of the 1s and 0s within them. To decide if a binary vector b is a codeword, a test using matrix multiplication is performed. The matrix given by

\[
C = \begin{bmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 1
\end{bmatrix}
\]

is called the check matrix. To carry out the test, we compute the product of C and b, using modulo 2 arithmetic, where an even result corresponds to a 0 and an odd result corresponds to a 1. This product produces a binary vector with three components called the syndrome vector given by

Cb = s


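A binary vector b is a codeword if the syndrome vector s = 0. Put another way, b is a codeword if it is a solution to the homogeneous equation Cb = 0 (mod 2). For example, the vector

\[
\mathbf{u} = \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}
\]

is a codeword since

\[
C\mathbf{u} =
\begin{bmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}
= \begin{bmatrix} 2 \\ 2 \\ 2 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \pmod{2}
\]

whereas the vector

\[
\mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}
\quad \text{is not since} \quad
C\mathbf{v} = \begin{bmatrix} 3 \\ 2 \\ 2 \end{bmatrix}
= \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \pmod{2}
\]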
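The codeword test described above amounts to one matrix-vector product reduced modulo 2. The sketch below reproduces the two computations from the text; the function names `syndrome` and `is_codeword` are hypothetical helpers chosen for this illustration.

```python
# Check matrix C from the text
C = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

def syndrome(b):
    """Syndrome vector s = Cb, with every entry reduced modulo 2."""
    return [sum(c * x for c, x in zip(row, b)) % 2 for row in C]

def is_codeword(b):
    """A binary vector is a codeword when its syndrome is the zero vector."""
    return syndrome(b) == [0, 0, 0]

u = [1, 1, 0, 0, 0, 1, 1]
v = [1, 1, 1, 0, 0, 0, 0]

print(syndrome(u), is_codeword(u))  # [0, 0, 0] True
print(syndrome(v), is_codeword(v))  # [1, 0, 0] False
```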


With this ingenious strategy the recipient of a legitimate codeword can safely assume that the vector is free from errors. On the other hand, if the vector received is not a codeword, an algorithm involving the syndrome vector can be applied to restore it to the original. In the previous example the fifth digit of v was altered during the transmission. The intended vector is given by

\[
\mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}
\]

Hamming's (7,4) code is classified as a linear code since the sum of any two codewords is also a codeword. To see this, observe that if u and v are codewords, then the sum u + v is also a codeword since

C(u + v) = Cu + Cv = 0 + 0 = 0 (mod 2)

It also has the property that every codeword can be written as a linear combination of a few key codewords.
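The text does not spell out the restoring algorithm. One common approach for Hamming codes, assumed here rather than taken from the text, is to look for the column of C that equals the syndrome and flip the corresponding digit; this works because the seven columns of C are distinct and nonzero. The sketch below applies that rule to the received vector v and then checks the linearity property on one pair of codewords. The names `correct_single_error` and `add_mod2` are hypothetical.

```python
# Check matrix C from the text
C = [
    [1, 1, 1, 0, 1, 0, 0],
    [0, 1, 1, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

def syndrome(b):
    """Syndrome vector s = Cb (mod 2)."""
    return [sum(c * x for c, x in zip(row, b)) % 2 for row in C]

def correct_single_error(b):
    """Assumed decoding rule: flip the digit whose column of C equals the syndrome."""
    s = syndrome(b)
    if s == [0, 0, 0]:
        return list(b)  # already a codeword
    columns = [[row[j] for row in C] for j in range(7)]
    j = columns.index(s)  # position of the altered digit (0-based)
    fixed = list(b)
    fixed[j] ^= 1
    return fixed

def add_mod2(x, y):
    """Componentwise sum of two binary vectors modulo 2."""
    return [(a + b) % 2 for a, b in zip(x, y)]

received = [1, 1, 1, 0, 0, 0, 0]
restored = correct_single_error(received)
print(restored)  # [1, 1, 1, 0, 1, 0, 0], the fifth digit is flipped

# Linearity: the sum of two codewords has zero syndrome
u = [1, 1, 0, 0, 0, 1, 1]
print(syndrome(add_mod2(u, restored)))  # [0, 0, 0]
```

The rule corrects a single altered digit only; if two or more digits change, the restored vector is a codeword but not necessarily the intended one.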


In this chapter we will see how the set of all linear combinations of a set of vectors 
forms a vector space. The set of codewords in the chapter opener is an example. 


3.1 Definition of a Vector Space


In Chap. 2 we defined a natural addition and scalar multiplication on vectors in R^n as generalizations of the same operations on real numbers. With respect to these operations, we saw in Theorem 1 of Sec. 2.1 that sets of vectors satisfy many of the familiar algebraic properties enjoyed by numbers. In this section we use these properties as axioms to generalize the concept of a vector still further. In particular, we consider as vectors any class of objects with definitions for addition and scalar multiplication that satisfy the properties of this theorem. In this way our new concept of a vector will include vectors in R^n but many new kinds as well.


DEFINITION 1 Vector Space A set V is called a vector space over the real numbers provided that there are two operations (addition, denoted by ⊕, and scalar multiplication, denoted by ⊙) that satisfy all the following axioms. The axioms must hold for all vectors u, v, and w in V and all scalars c and d in R.

1. The sum u ⊕ v is in V. (Closed under addition)
2. u ⊕ v = v ⊕ u (Addition is commutative)
3. (u ⊕ v) ⊕ w = u ⊕ (v ⊕ w) (Addition is associative)
4. There exists a vector 0 ∈ V such that for every vector u ∈ V, 0 ⊕ u = u ⊕ 0 = u. (Additive identity)
5. For every vector u ∈ V, there exists a vector, denoted by −u, such that u ⊕ (−u) = −u ⊕ u = 0. (Additive inverse)
6. The scalar product c ⊙ u is in V. (Closed under scalar multiplication)
7. c ⊙ (u ⊕ v) = (c ⊙ u) ⊕ (c ⊙ v)
8. (c + d) ⊙ u = (c ⊙ u) ⊕ (d ⊙ u)
9. c ⊙ (d ⊙ u) = (cd) ⊙ u
10. 1 ⊙ u = u


In this section (and elsewhere when necessary) we use the special symbols ⊕ and ⊙ of the previous definition to distinguish vector addition and scalar multiplication from ordinary addition and multiplication of real numbers. We also point out that for general vector spaces the set of scalars can be chosen from any field. In this text, unless otherwise stated, we choose scalars from the set of real numbers.


EXAMPLE 1 Euclidean Vector Spaces The set V = R^n with the standard operations of addition and scalar multiplication is a vector space.

Solution Axioms 2 through 5 and 7 through 10 are shown to hold in Theorem 1 of Sec. 2.1. The fact that R^n is closed under addition and scalar multiplication is a direct consequence of how these operations are defined. The Euclidean vector spaces R^n are the prototypical vector spaces on which the general theory of vector spaces is built.


EXAMPLE 2 Vector Spaces of Matrices Show that the set V = M_{m×n} of all m × n matrices is a vector space over the scalar field R, with ⊕ and ⊙ defined componentwise.

Solution Since addition of matrices is componentwise, the sum of two m × n matrices is another m × n matrix, as is a scalar times an m × n matrix. Thus, the closure axioms (axioms 1 and 6) are satisfied. We also have that 1 ⊙ A = A. The other seven axioms are given in Theorem 4 of Sec. 1.3.




When we are working with more abstract sets of objects, the operations of addition 
and scalar multiplication can be defined in nonstandard ways. The result is not always 
a vector space. This is illustrated in the next several examples. 


EXAMPLE 3 Let $V = \mathbb{R}$. Define addition and scalar multiplication by
$$a \oplus b = 2a + 2b \quad\text{and}\quad k \odot a = ka$$


Show that addition is commutative but not associative. 


Solution Since the usual addition of real numbers (on the right-hand side) is commutative,
$$a \oplus b = 2a + 2b = 2b + 2a = b \oplus a$$
Thus, the operation $\oplus$ is commutative.

To determine whether addition is associative, we evaluate and compare the
expressions
$$(a \oplus b) \oplus c \quad\text{and}\quad a \oplus (b \oplus c)$$
In this case, we have
$$\begin{aligned}
(a \oplus b) \oplus c &= (2a + 2b) \oplus c & a \oplus (b \oplus c) &= a \oplus (2b + 2c) \\
&= 2(2a + 2b) + 2c & &= 2a + 2(2b + 2c) \\
&= 4a + 4b + 2c & &= 2a + 4b + 4c
\end{aligned}$$


We see that the two final expressions are not equal for all choices of a, b, and 
c. Therefore, the associative property is not upheld, and V is not a vector space. 
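
The computation above can be spot-checked numerically. The following is a minimal sketch, not part of the original text; the function name `add` and the sample values are illustrative assumptions.

```python
def add(a, b):
    """Nonstandard addition from the example: a (+) b = 2a + 2b."""
    return 2 * a + 2 * b

a, b, c = 1, 2, 3

# Commutativity holds because ordinary addition is commutative.
commutative = add(a, b) == add(b, a)

# Associativity fails: the two groupings give different numbers.
left = add(add(a, b), c)    # (a (+) b) (+) c = 4a + 4b + 2c
right = add(a, add(b, c))   # a (+) (b (+) c) = 2a + 4b + 4c

print(commutative, left, right)
```

A single choice of $a$, $b$, and $c$ for which the two groupings differ is enough to show that axiom 3 fails.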


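EXAMPLE 4 Let $V = \mathbb{R}$. Define addition and scalar multiplication by
$$a \oplus b = a^b \quad\text{and}\quad k \odot a = ka$$
Show that $V$ is not a vector space.

Solution In this case
$$a \oplus b = a^b \quad\text{and}\quad b \oplus a = b^a$$
Since $a^b \neq b^a$ for some choices of $a$ and $b$ (for example, $2^3 = 8$ but $3^2 = 9$),
the commutative property of addition is not upheld, and $V$ is not a vector space.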


In Example 5 we show that familiar sets with nonstandard definitions for addition 
and scalar multiplication can be vector spaces. 




EXAMPLE 5 Let $V = \{(a, b) \mid a, b \in \mathbb{R}\}$. Let $\mathbf{v} = (v_1, v_2)$ and $\mathbf{w} = (w_1, w_2)$. Define
$$(v_1, v_2) \oplus (w_1, w_2) = (v_1 + w_1 + 1,\; v_2 + w_2 + 1) \quad\text{and}\quad c \odot (v_1, v_2) = (cv_1 + c - 1,\; cv_2 + c - 1)$$
Verify that $V$ is a vector space.


Solution First observe that since the result of addition or scalar multiplication is an ordered
pair, $V$ is closed under addition and scalar multiplication. Since addition of real
numbers is commutative and associative, axioms 2 and 3 hold for the $\oplus$ defined
here. Now an element $\mathbf{w} \in V$ is the additive identity provided that for all $\mathbf{v} \in V$
$$\mathbf{v} \oplus \mathbf{w} = \mathbf{v} \quad\text{or}\quad (v_1 + w_1 + 1,\; v_2 + w_2 + 1) = (v_1, v_2)$$
Equating components gives
$$v_1 + w_1 + 1 = v_1 \quad\text{and}\quad v_2 + w_2 + 1 = v_2 \quad\text{so}\quad w_1 = -1 \quad\text{and}\quad w_2 = -1$$
This establishes the existence of an additive identity. Specifically, $\mathbf{0} = (-1, -1)$,
so axiom 4 holds.
To show that each element $\mathbf{v}$ in $V$ has an additive inverse, we must find a
vector $\mathbf{w}$ such that
$$\mathbf{v} \oplus \mathbf{w} = \mathbf{0} = (-1, -1)$$
Since $\mathbf{v} \oplus \mathbf{w} = (v_1 + w_1 + 1,\; v_2 + w_2 + 1)$, this last equation requires that
$$v_1 + w_1 + 1 = -1 \quad\text{and}\quad v_2 + w_2 + 1 = -1 \quad\text{so that}\quad w_1 = -v_1 - 2 \quad\text{and}\quad w_2 = -v_2 - 2$$
Thus, for any element $\mathbf{v} = (v_1, v_2)$ in $V$, we have $-\mathbf{v} = (-v_1 - 2,\; -v_2 - 2)$. The
remaining axioms all follow from the similar properties of the real numbers.
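
As a sanity check on this example, the nonstandard operations can be coded directly and the axioms tested on sample vectors. This is a minimal sketch, not a proof; the names `vadd`, `smul`, `neg`, `ZERO`, and `check_axioms` are illustrative assumptions, and exact rational scalars are used to avoid rounding.

```python
from fractions import Fraction

ZERO = (-1, -1)  # additive identity found in the example

def vadd(v, w):
    """(v1, v2) (+) (w1, w2) = (v1 + w1 + 1, v2 + w2 + 1)."""
    return (v[0] + w[0] + 1, v[1] + w[1] + 1)

def smul(c, v):
    """c (.) (v1, v2) = (c*v1 + c - 1, c*v2 + c - 1)."""
    return (c * v[0] + c - 1, c * v[1] + c - 1)

def neg(v):
    """Additive inverse found in the example: (-v1 - 2, -v2 - 2)."""
    return (-v[0] - 2, -v[1] - 2)

def check_axioms(u, v, w, c, d):
    """Test axioms 2-5 and 7-10 for one choice of vectors and scalars."""
    return all([
        vadd(u, v) == vadd(v, u),                           # axiom 2
        vadd(vadd(u, v), w) == vadd(u, vadd(v, w)),         # axiom 3
        vadd(u, ZERO) == u and vadd(ZERO, u) == u,          # axiom 4
        vadd(u, neg(u)) == ZERO,                            # axiom 5
        smul(c, vadd(u, v)) == vadd(smul(c, u), smul(c, v)),  # axiom 7
        smul(c + d, u) == vadd(smul(c, u), smul(d, u)),     # axiom 8
        smul(c, smul(d, u)) == smul(c * d, u),              # axiom 9
        smul(1, u) == u,                                    # axiom 10
    ])

ok = check_axioms((1, 2), (3, -4), (0, 5), Fraction(1, 2), Fraction(-3))
print(ok)
```

Passing for a few samples does not establish the axioms in general, but a failure would immediately expose an error in the formulas for the identity or the inverse.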


A polynomial of degree $n$ is an expression of the form
$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1} + a_n x^n$$
where $a_0, \ldots, a_n$ are real numbers and $a_n \neq 0$. The degree of the zero polynomial is
undefined since it can be written as $p(x) = 0x^n$ for any positive integer $n$. Polynomials
comprise one of the most basic sets of functions and have many applications in
mathematics.


EXAMPLE 6 Vector Space of Polynomials Let $n$ be a fixed positive integer. Denote by $P_n$
the set of all polynomials of degree $n$ or less. Define addition by adding like terms.
That is, if
$$p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_{n-1} x^{n-1} + a_n x^n$$
and
$$q(x) = b_0 + b_1 x + b_2 x^2 + \cdots + b_{n-1} x^{n-1} + b_n x^n$$
then
$$p(x) \oplus q(x) = (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 + \cdots + (a_n + b_n)x^n$$
If $c$ is a scalar, then scalar multiplication is defined by
$$c \odot p(x) = ca_0 + ca_1 x + ca_2 x^2 + \cdots + ca_{n-1} x^{n-1} + ca_n x^n$$
Verify that $V = P_n \cup \{\mathbf{0}\}$ is a real vector space, where $\mathbf{0}$ is the zero polynomial.


Solution Since the sum of two polynomials of degree $n$ or less is another polynomial of
degree $n$ or less, with the same holding for scalar multiplication, the set $V$ is closed
under addition and scalar multiplication. The zero vector is just the zero polynomial,
and the additive inverse of $p(x)$ is given by
$$-p(x) = -a_0 - a_1 x - a_2 x^2 - \cdots - a_{n-1} x^{n-1} - a_n x^n$$
The remaining axioms are consequences of the properties of real numbers. For
example,
$$\begin{aligned}
p(x) \oplus q(x) &= (a_0 + b_0) + (a_1 + b_1)x + (a_2 + b_2)x^2 + \cdots + (a_n + b_n)x^n \\
&= (b_0 + a_0) + (b_1 + a_1)x + (b_2 + a_2)x^2 + \cdots + (b_n + a_n)x^n \\
&= q(x) \oplus p(x)
\end{aligned}$$


In the sequel we will use $P_n$ to denote the vector space of polynomials of degree
$n$ or less along with the zero polynomial.

The condition degree $n$ or less cannot be replaced with all polynomials of degree
equal to $n$. The latter set is not closed under addition. For example, the polynomials
$x^2 - 2x + 1$ and $-x^2 + 3x + 4$ are both polynomials of degree 2, but the sum is
$x + 5$, which has degree equal to 1.
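
The loss of degree under addition is easy to see when polynomials are stored as coefficient lists. The sketch below is illustrative only; the list representation (constant term first) and the names `poly_add` and `degree` are assumptions, not notation from the text.

```python
def poly_add(p, q):
    """Add two polynomials given as coefficient lists [a0, a1, ..., an]."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))   # pad the shorter list with zeros
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def degree(p):
    """Degree of p, or None for the zero polynomial (degree undefined)."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return None

p = [1, -2, 1]      # x^2 - 2x + 1
q = [4, 3, -1]      # -x^2 + 3x + 4
s = poly_add(p, q)  # coefficients of x + 5

print(degree(p), degree(q), degree(s))
```

Both summands have degree 2, while the sum has degree 1, so the set of polynomials of degree exactly 2 is not closed under addition.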


EXAMPLE 7 Vector Space of Real-Valued Functions Let $V$ be the set of real-valued
functions defined on a common domain given by the interval $[a, b]$. For all $f$ and
$g$ in $V$ and $c \in \mathbb{R}$, define addition and scalar multiplication, respectively, by
$$(f \oplus g)(x) = f(x) + g(x) \quad\text{and}\quad (c \odot f)(x) = cf(x)$$
for each $x$ in $[a, b]$. Show that $V$ is a real vector space.

Solution Since the pointwise sum of two functions with domain $[a, b]$ is another function
with domain $[a, b]$, the set $V$ is closed under addition. Similarly, the set $V$ is closed
under scalar multiplication.


To show that addition in $V$ is commutative, let $f$ and $g$ be functions in $V$.
Then
$$(f \oplus g)(x) = f(x) + g(x) = g(x) + f(x) = (g \oplus f)(x)$$
Addition is also associative since for any functions $f$, $g$, and $h$ in $V$, we have
$$\begin{aligned}
(f \oplus (g \oplus h))(x) &= f(x) + (g \oplus h)(x) \\
&= f(x) + g(x) + h(x) \\
&= (f \oplus g)(x) + h(x) \\
&= ((f \oplus g) \oplus h)(x)
\end{aligned}$$
The zero element of $V$, denoted by $\mathbf{0}$, is the function that is 0 for all real
numbers in $[a, b]$. We have that $\mathbf{0}$ is the additive identity on $V$ since
$$(f \oplus \mathbf{0})(x) = f(x) + \mathbf{0}(x) = f(x)$$
Next we let $c$ and $d$ be real numbers and let $f$ be an element of $V$. The
distributive property of real numbers gives us
$$\begin{aligned}
((c + d) \odot f)(x) &= (c + d) f(x) = cf(x) + df(x) \\
&= (c \odot f)(x) + (d \odot f)(x)
\end{aligned}$$
so $(c + d) \odot f = (c \odot f) \oplus (d \odot f)$, establishing property 8.
The other properties follow in a similar manner.


The set of complex numbers, denoted by $\mathbb{C}$, is defined by
$$\mathbb{C} = \{a + bi \mid a, b \in \mathbb{R}\}$$
where $i$ satisfies
$$i^2 = -1 \quad\text{or equivalently}\quad i = \sqrt{-1}$$

The set of complex numbers is an algebraic extension of the real numbers, which
it contains as a subset. For every complex number $z = a + bi$, the real number $a$ is
called the real part of $z$ and the real number $b$ the imaginary part of $z$.

With the appropriate definitions of addition and scalar multiplication, the set of
complex numbers $\mathbb{C}$ is a vector space.

EXAMPLE 8 Vector Space of Complex Numbers Let $z = a + bi$ and $w = c + di$ be
elements of $\mathbb{C}$ and $\alpha$ a real number. Define vector addition on $\mathbb{C}$ by
$$z \oplus w = (a + bi) + (c + di) = (a + c) + (b + d)i$$
and scalar multiplication by
$$\alpha \odot z = \alpha \odot (a + bi) = \alpha a + (\alpha b)i$$
Verify that $\mathbb{C}$ is a vector space.


Solution For each element $z = a + bi$ in $\mathbb{C}$, associate the vector in $\mathbb{R}^2$ whose components
are the real and imaginary parts of $z$. That is, let
$$z = a + bi \longleftrightarrow \begin{bmatrix} a \\ b \end{bmatrix}$$
Observe that addition and scalar multiplication in $\mathbb{C}$ correspond to those same
operations in $\mathbb{R}^2$. In this way $\mathbb{C}$ and $\mathbb{R}^2$ have the same algebraic structure. Since
$\mathbb{R}^2$ is a vector space, so is $\mathbb{C}$.


In Example 8, we showed that C is a vector space over the real numbers. It is 
also possible to show that C is a vector space over the complex scalars. We leave the 
details to the reader. 

Example 9 is from analytic geometry. 


EXAMPLE 9 Let $a$, $b$, and $c$ be fixed real numbers. Let $V$ be the set of points in three-dimensional
Euclidean space that lie on the plane $P$ given by
$$ax + by + cz = 0$$
Define addition and scalar multiplication on $V$ coordinatewise. Verify that $V$ is a
vector space.


Solution To show that $V$ is closed under addition, let $\mathbf{u} = (u_1, u_2, u_3)$ and $\mathbf{v} = (v_1, v_2, v_3)$
be points in $V$. The vectors $\mathbf{u}$ and $\mathbf{v}$ are in $V$ provided that
$$au_1 + bu_2 + cu_3 = 0 \quad\text{and}\quad av_1 + bv_2 + cv_3 = 0$$
Now by definition
$$\mathbf{u} \oplus \mathbf{v} = (u_1 + v_1,\; u_2 + v_2,\; u_3 + v_3)$$
We know that $\mathbf{u} \oplus \mathbf{v}$ is in $V$ since
$$\begin{aligned}
a(u_1 + v_1) + b(u_2 + v_2) + c(u_3 + v_3) &= au_1 + av_1 + bu_2 + bv_2 + cu_3 + cv_3 \\
&= (au_1 + bu_2 + cu_3) + (av_1 + bv_2 + cv_3) \\
&= 0
\end{aligned}$$
Similarly, $V$ is closed under scalar multiplication since for any scalar $\alpha$, we have
$$\alpha \odot \mathbf{u} = (\alpha u_1,\; \alpha u_2,\; \alpha u_3)$$
and
$$a(\alpha u_1) + b(\alpha u_2) + c(\alpha u_3) = \alpha(au_1 + bu_2 + cu_3) = \alpha(0) = 0$$
In this case the zero vector is $(0, 0, 0)$, which is also on the plane $P$. Since
the addition and scalar multiplication defined on $V$ are the analogous operations
defined on the vector space $\mathbb{R}^3$, the remaining axioms are satisfied for elements of
$V$ as well.




We conclude this section by showing that some familiar algebraic properties of
$\mathbb{R}^n$ extend to abstract vector spaces.


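THEOREM 1 In a vector space $V$, additive inverses are unique.

Proof Let $\mathbf{u}$ be an element of $V$. Suppose that $\mathbf{v}$ and $\mathbf{w}$ are elements of $V$ and
both are additive inverses of $\mathbf{u}$. We show that $\mathbf{v} = \mathbf{w}$. Since
$$\mathbf{u} \oplus \mathbf{v} = \mathbf{0} \quad\text{and}\quad \mathbf{u} \oplus \mathbf{w} = \mathbf{0}$$
axioms 4, 3, and 2 give
$$\mathbf{v} = \mathbf{v} \oplus \mathbf{0} = \mathbf{v} \oplus (\mathbf{u} \oplus \mathbf{w}) = (\mathbf{v} \oplus \mathbf{u}) \oplus \mathbf{w} = \mathbf{0} \oplus \mathbf{w} = \mathbf{w}$$
establishing the result.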


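THEOREM 2 Let $V$ be a vector space, $\mathbf{u}$ a vector in $V$, and $c$ a real number.
1. $0 \odot \mathbf{u} = \mathbf{0}$
2. $c \odot \mathbf{0} = \mathbf{0}$
3. $(-1) \odot \mathbf{u} = -\mathbf{u}$
4. If $c \odot \mathbf{u} = \mathbf{0}$, then either $c = 0$ or $\mathbf{u} = \mathbf{0}$.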
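Proof (1) By axiom 8, we have
$$0 \odot \mathbf{u} = (0 + 0) \odot \mathbf{u} = (0 \odot \mathbf{u}) \oplus (0 \odot \mathbf{u})$$
Adding the inverse $-(0 \odot \mathbf{u})$ to both sides of the preceding equation gives the
result.
(2) By axiom 4, we know that $\mathbf{0} \oplus \mathbf{0} = \mathbf{0}$. Combining this with axiom 7 gives
$$c \odot \mathbf{0} = c \odot (\mathbf{0} \oplus \mathbf{0}) = (c \odot \mathbf{0}) \oplus (c \odot \mathbf{0})$$
Again adding the inverse $-(c \odot \mathbf{0})$ to both sides of the last equation gives the
result.
(3) By axioms 10 and 8 and part 1 of this theorem,
$$\begin{aligned}
\mathbf{u} \oplus (-1) \odot \mathbf{u} &= (1 \odot \mathbf{u}) \oplus [(-1) \odot \mathbf{u}] \\
&= (1 - 1) \odot \mathbf{u} \\
&= 0 \odot \mathbf{u} \\
&= \mathbf{0}
\end{aligned}$$
Thus, $(-1) \odot \mathbf{u}$ is an additive inverse of $\mathbf{u}$. Since $-\mathbf{u}$ is by definition the
additive inverse of $\mathbf{u}$ and by Theorem 1 additive inverses are unique, we have
$(-1) \odot \mathbf{u} = -\mathbf{u}$.
(4) Let $c \odot \mathbf{u} = \mathbf{0}$. If $c = 0$, then the conclusion holds. Suppose that $c \neq 0$. Then
multiply both sides of
$$c \odot \mathbf{u} = \mathbf{0}$$
by $\frac{1}{c}$ and apply part 2 of this theorem to obtain
$$\frac{1}{c} \odot (c \odot \mathbf{u}) = \mathbf{0} \quad\text{so that}\quad 1 \odot \mathbf{u} = \mathbf{0}$$
and hence $\mathbf{u} = \mathbf{0}$.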


Fact Summary 


1. To determine whether a set $V$ with addition and scalar multiplication
defined on $V$ is a vector space requires verification of the 10 vector space
axioms.

2. The Euclidean space $\mathbb{R}^n$ and the set of matrices $M_{m \times n}$, with the standard
componentwise operations, are vector spaces. The set of polynomials of
degree $n$ or less, including the zero polynomial, with termwise operations
is a vector space.

3. In all vector spaces, additive inverses are unique. Also
$$0 \odot \mathbf{u} = \mathbf{0} \qquad c \odot \mathbf{0} = \mathbf{0} \qquad\text{and}\qquad (-1) \odot \mathbf{u} = -\mathbf{u}$$
In addition if $c \odot \mathbf{u} = \mathbf{0}$, then either the scalar $c$ is the number 0 or the
vector $\mathbf{u}$ is the zero vector.


Exercise Set 3.1 


In Exercises 1—4, let V = R?. Show that V with the X1 X2 2x1 + 2x2 
given operations for ® and © is not a vector space. 3, | vy | ©] y | = |] 291+ 2y2 
x x9 po Z1 Z2 221 + 2z2 
L |y | ©] w |} =] z- 
ral z2 Z1 — z2 m eel 
cO} y |} =] cy 
| Z1 | | CZ1 
Xi CX1 
of |=[5 x] x2 x1 + x2 
a = 4) ym | 8] x| =| zt 
Z1 Z2 £1 22 
X1 X2 xı +x: — 1 
2. | y e[s |- [3t] xı c+x 
Z1 22 z +g- cOl|y |= yı 
Z1 Z1 
xı cxi . . 2 
col y | =| o 5. Write out all 10 vector space axioms to show R 
A či with the standard componentwise operations is a 


vector space. 




6. 


10. 


11. 


12. 




Write out all 10 vector space axioms to show that
$M_{2 \times 2}$ with the standard componentwise operations
is a vector space.


. Let V = R? and define addition as the standard 


componentwise addition and define scalar 
multiplication by 


cele 


Show that V is not a vector space. 


. Let 


a 
V= b a,beR 
1 


a. With the standard componentwise operations 


show that V is not a vector space. 


b. If addition and scalar multiplication are defined 


componentwise only on the first two 
components and the third is always 1, show 
that V is a vector space. 


. Let V = R? and define 


ole ]=[ 


Determine whether V is a vector space. 


mlesce 


and let addition and scalar multiplication be the 
standard operations on vectors. Determine 
whether V is a vector space. 

te R} 


a 


and let addition and scalar multiplication be the 
standard operations on vectors. Determine 
whether V is a vector space. 


Let 


Let 


Let 


and let addition and scalar multiplication be the 
standard componentwise operations. Determine 
whether V is a vector space. 


13. Let 
a b 
E 


a. If addition and scalar multiplication are the 
standard componentwise operations, show that 
V is not a vector space. 


b. Define 


ab cer} 


a b ka kb 
o | cl z | ke 1 
Show that V is a vector space. 


In Exercises 14—19, let V be the set of 2 x 2 matrices 
with the standard (componentwise) definitions for 
vector addition and scalar multiplication. Determine 
whether V is a vector space. If V is not a vector space, 
show that at least one of the 10 axioms does not hold. 


14. Let V be the set of all skew-symmetric matrices,
that is, the set of all matrices such that $A^t = -A$.


15. Let V be the set of all upper triangular matrices.


16. Let V be the set of all real symmetric matrices,
that is, the set of all matrices such that $A^t = A$.


17. Let V be the set of all invertible matrices. 
18. Let V be the set of all idempotent matrices. 


19. Let B be a fixed matrix, and let V be the set of 
all matrices A such that AB = 0. 


20. Let 
SE a 
c —a 


and define addition and scalar multiplication as 
the standard componentwise operations. 
Determine whether V is a vector space. 


a.bceR} 


21. Let V denote the set of $2 \times 2$ invertible matrices.
Define
$$A \oplus B = AB \qquad c \odot A = cA$$


a. Determine the additive identity and additive 
inverse. 


b. Show that V is not a vector space. 


22. Let 
t 
ay rer} 


Define 


O t E ct 
7 Cake |= | i+ct 


a. Find the additive identity and inverse. 
b. Show that V is a vector space. 
c. Verify that 0 © v = 0 for all v. 


23. Let 
1+ t 
V= 2— t teR 
34 2t 
Define 
1+ f 1+ b 
2— t @®| 2- hb 
3+2tı 3+2b 
1+ (4) +h) 
=| 2- (ti +h) 
3 + (2t + 2t) 
1+ t 1+ ct 
cO| 2- ¢t}]= | 2- ct 
3+ 2t 3+ 2ct 


a. Find the additive identity and inverse. 
b. Show that V is a vector space. 
c. Verify 0© v = 0 for all v. 


24. Let 
1 2 
u=] 0 v=] -l 
1 
and 


S = {au + bv|a,b € R} 


25. 


26. 


27. 


28. 


Show that S with the standard componentwise 
operations is a vector space. 
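Let $\mathbf{v}$ be a vector in $\mathbb{R}^n$, and let
$$S = \{\mathbf{v}\}$$
Define $\oplus$ and $\odot$ by
$$\mathbf{v} \oplus \mathbf{v} = \mathbf{v} \qquad c \odot \mathbf{v} = \mathbf{v}$$
Show that $S$ is a vector space.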


Let 
x 
S= y 
Z 


3x —2y+z=0 


Show that S with the standard componentwise 
operations is a vector space. 


Let S be the set of all vectors 


Xx 


y 
Zz 


in R? such that x + y — z = 0 and 
2x — 3y + 2z = 0. Show that S with the standard 
componentwise operations is a vector space. 


ee rer} 

sint 

| cos fı e | COS fp | 
sin ty sin h 


cos(ti + h) 
sin(tı + t2) 


cost cos ct 
cO R = l 
sin t sin ct 


Let 


and define 


a. Determine the additive identity and additive 


inverse. 


b. Show that V is a vector space. 


c. Show that if @ and © are the standard 
componentwise operations, then V is not a 
vector space. 




29. Let V be the set of all real-valued functions Determine whether V is a vector space. 


30. 


defined on R with the standard operations that 
satisfy f (0) = 1. Determine whether V is a 


vector space. 


31. Let f(x) = x? defined on R and let 
V={f@tn|teR} 


Let V be the set of all real-valued functions Define 


defined on R. 
Define f ® g by 


(F D8) = fa) +ga) 


and define c © f by 


fx+t) ®fath) = fett +h) 


cOfa+t) = fatct) 
a. Determine the additive identity and additive 
inverses. 


CO fE) = fæ +e) b. Show that V is a vector space. 


3.2 Subspaces

Many interesting examples of vector spaces are subsets of a given vector space $V$ that
are vector spaces in their own right. For example, the $xy$ plane in $\mathbb{R}^3$ given by
$$\left\{ \begin{bmatrix} x \\ y \\ 0 \end{bmatrix} \;\middle|\; x, y \in \mathbb{R} \right\}$$
is a subset of $\mathbb{R}^3$. It is also a vector space with the same standard componentwise
operations defined on $\mathbb{R}^3$. Another example of a subspace of a vector space is given
in Example 9 of Sec. 3.1. The determination as to whether a subset of a vector space
is itself a vector space is simplified since many of the required properties are inherited
from the parent space.

DEFINITION 1 Subspace A subspace $W$ of a vector space $V$ is a nonempty subset that is itself
a vector space with respect to the inherited operations of vector addition and scalar
multiplication on $V$.


The first requirement for a subset $W \subseteq V$ to be a subspace is that $W$ be closed
under the operations of $V$. For example, let $V$ be the vector space $\mathbb{R}^2$ with the
standard definitions of addition and scalar multiplication. Let $W \subseteq \mathbb{R}^2$ be the subset
defined by
$$W = \left\{ \begin{bmatrix} a \\ 0 \end{bmatrix} \;\middle|\; a \in \mathbb{R} \right\}$$
Observe that the sum of any two vectors in $W$ is another vector in $W$, since
$$\begin{bmatrix} a \\ 0 \end{bmatrix} \oplus \begin{bmatrix} b \\ 0 \end{bmatrix} = \begin{bmatrix} a + b \\ 0 \end{bmatrix}$$
In this way we say that $W$ is closed under addition. The subset $W$ is also closed under
scalar multiplication since for any real number $c$,
$$c \odot \begin{bmatrix} a \\ 0 \end{bmatrix} = \begin{bmatrix} ca \\ 0 \end{bmatrix}$$
which is again in $W$.

On the other hand, the subset
$$W = \left\{ \begin{bmatrix} a \\ 1 \end{bmatrix} \;\middle|\; a \in \mathbb{R} \right\}$$
is not closed under addition since
$$\begin{bmatrix} a \\ 1 \end{bmatrix} \oplus \begin{bmatrix} b \\ 1 \end{bmatrix} = \begin{bmatrix} a + b \\ 2 \end{bmatrix}$$
which is not in $W$. See Fig. 1. The subset $W$ is also not closed under scalar
multiplication since
$$c \odot \begin{bmatrix} a \\ 1 \end{bmatrix} = \begin{bmatrix} ca \\ c \end{bmatrix}$$
which is not in $W$ for any value of $c \neq 1$.

[Figure 1: W is not a subspace of V.]

Now let us suppose that a nonempty subset $W$ is closed under both of the
operations on $V$. To determine whether $W$ is a subspace, we must show that each of the
remaining vector space axioms holds. Fortunately, our task is simplified as most of
these properties are inherited from the vector space $V$. For example, to show that the
commutative property holds in $W$, let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $W$. Since $\mathbf{u}$ and $\mathbf{v}$ are
also in $V$, then
$$\mathbf{u} \oplus \mathbf{v} = \mathbf{v} \oplus \mathbf{u}$$
Similarly, any three vectors in $W$ satisfy the associative property, as this property is
also inherited from $V$. To show that $W$ contains the zero vector, let $\mathbf{w}$ be any vector
in $W$. Since $W$ is closed under scalar multiplication, $0 \odot \mathbf{w} \in W$. Now, by Theorem 2
of Sec. 3.1, we have $0 \odot \mathbf{w} = \mathbf{0}$. Thus, $\mathbf{0} \in W$. Similarly, for any $\mathbf{w} \in W$,
$$(-1) \odot \mathbf{w} = -\mathbf{w}$$
is also in $W$. All the other vector space properties, axioms 7 through 10, are inherited
from $V$. This shows that $W$ is a subspace of $V$. Conversely, if $W$ is a subspace of
$V$, then it is necessarily closed under addition and scalar multiplication. This proves
Theorem 3.

THEOREM 3 Let $W$ be a nonempty subset of the vector space $V$. Then $W$ is a subspace of $V$ if
and only if $W$ is closed under addition and scalar multiplication.


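By Theorem 3, the first of the examples above with
$$W = \left\{ \begin{bmatrix} a \\ 0 \end{bmatrix} \;\middle|\; a \in \mathbb{R} \right\}$$
is a subspace of $\mathbb{R}^2$ while the second subset
$$W = \left\{ \begin{bmatrix} a \\ 1 \end{bmatrix} \;\middle|\; a \in \mathbb{R} \right\}$$
is not.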


For any vector space V the subset W = {0}, consisting of only the zero vector, 
is a subspace of V, called the trivial subspace. We also have that any vector space 
V, being a subset of itself, is a subspace. 


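EXAMPLE 1 Let
$$W = \left\{ \begin{bmatrix} a \\ a + 1 \end{bmatrix} \;\middle|\; a \in \mathbb{R} \right\}$$
be a subset of the vector space $V = \mathbb{R}^2$ with the standard definitions of addition
and scalar multiplication. Determine whether $W$ is a subspace of $V$.

Solution In light of Theorem 3, we check to see if $W$ is closed under addition and scalar
multiplication. Let
$$\mathbf{u} = \begin{bmatrix} u \\ u + 1 \end{bmatrix} \quad\text{and}\quad \mathbf{v} = \begin{bmatrix} v \\ v + 1 \end{bmatrix}$$
be vectors in $W$. Adding the vectors gives
$$\mathbf{u} \oplus \mathbf{v} = \begin{bmatrix} u \\ u + 1 \end{bmatrix} \oplus \begin{bmatrix} v \\ v + 1 \end{bmatrix} = \begin{bmatrix} u + v \\ u + v + 2 \end{bmatrix}$$
This last vector is not in the required form since
$$u + v + 2 \neq u + v + 1$$
and hence we see that $\mathbf{u} \oplus \mathbf{v}$ is not in $W$. Thus, $W$ is not a subspace of $V$.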


It is sometimes easy to show that a subset $W$ of a vector space is not a subspace.
In particular, if $\mathbf{0} \notin W$ or the additive inverse of a vector is not in $W$, then $W$ is not a
subspace. In Example 1, $W$ is not a subspace since it does not contain the zero vector.


EXAMPLE 2 The trace of a square matrix is the sum of the entries on the diagonal. Let $M_{2 \times 2}$
be the vector space of $2 \times 2$ matrices with the standard operations for addition and
scalar multiplication, and let $W$ be the subset of all $2 \times 2$ matrices with trace 0,
that is,
$$W = \left\{ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \;\middle|\; a + d = 0 \right\}$$
Show that $W$ is a subspace of $M_{2 \times 2}$.


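Solution Let
$$\mathbf{w}_1 = \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} \quad\text{and}\quad \mathbf{w}_2 = \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix}$$
be matrices in $W$, so that $a_1 + d_1 = 0$ and $a_2 + d_2 = 0$. The sum of the two
matrices is
$$\mathbf{w}_1 \oplus \mathbf{w}_2 = \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} \oplus \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix} = \begin{bmatrix} a_1 + a_2 & b_1 + b_2 \\ c_1 + c_2 & d_1 + d_2 \end{bmatrix}$$
Since the trace of $\mathbf{w}_1 \oplus \mathbf{w}_2$ is
$$(a_1 + a_2) + (d_1 + d_2) = (a_1 + d_1) + (a_2 + d_2) = 0$$
then $W$ is closed under addition. Also, for any scalar $c$,
$$c \odot \mathbf{w}_1 = c \odot \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} = \begin{bmatrix} ca_1 & cb_1 \\ cc_1 & cd_1 \end{bmatrix}$$
The trace of this matrix is $ca_1 + cd_1 = c(a_1 + d_1) = 0$. Thus, $W$ is also closed
under scalar multiplication. Therefore, $W$ is a subspace of $M_{2 \times 2}$.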


EXAMPLE 3 Let $W$ be the subset of $V = M_{n \times n}$ consisting of all symmetric matrices. Let the
operations of addition and scalar multiplication on $V$ be the standard operations.
Show that $W$ is a subspace of $V$.

Solution Recall from Sec. 1.3 that a matrix $A$ is symmetric provided that $A^t = A$. Let $A$
and $B$ be matrices in $W$ and $c$ be a real number. By Theorem 6 of Sec. 1.3,
$$(A \oplus B)^t = A^t \oplus B^t = A \oplus B \quad\text{and}\quad (c \odot A)^t = c \odot A^t = c \odot A$$
Thus, $W$ is closed under addition and scalar multiplication, and consequently, by
Theorem 3, $W$ is a subspace.


EXAMPLE 4 Let $V = M_{n \times n}$ with the standard operations and $W$ be the subset of $V$ consisting
of all idempotent matrices. Determine whether $W$ is a subspace.

Solution Recall that a matrix $A$ is idempotent provided that $A^2 = A$. (See Exercise 42 of
Sec. 1.3.) Let $A$ be an element of $W$, so that $A^2 = A$. Then
$$(c \odot A)^2 = (cA)^2 = c^2 A^2 = c^2 A = c^2 \odot A$$
so that, for $A \neq \mathbf{0}$,
$$(c \odot A)^2 = c \odot A \quad\text{if and only if}\quad c^2 = c$$
Since this is not true for all values of $c$, then $W$ is not closed under scalar
multiplication and is not a subspace.
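
A concrete instance makes the failure visible. This is a minimal sketch with an illustrative idempotent matrix chosen here (it does not appear in the text); `matmul`, `scale`, and `is_idempotent` are hypothetical helper names.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scale(c, A):
    """Standard scalar multiple c * A."""
    return [[c * x for x in row] for row in A]

def is_idempotent(A):
    """True when A squared equals A."""
    return matmul(A, A) == A

A = [[1, 0], [0, 0]]   # idempotent: A^2 = A
B = scale(2, A)        # 2A squares to 4A, which differs from 2A

print(is_idempotent(A), is_idempotent(B))  # prints: True False
```

Here $c = 2$ gives $c^2 = 4 \neq c$, matching the condition derived in the solution.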




The two closure criteria for a subspace can be combined into one as stated in 
Theorem 4. 


THEOREM 4 A nonempty subset $W$ of a vector space $V$ is a subspace of $V$ if and only if for
each pair of vectors $\mathbf{u}$ and $\mathbf{v}$ in $W$ and each scalar $c$, the vector $\mathbf{u} \oplus (c \odot \mathbf{v})$ is in $W$.

Proof Let $W$ be a nonempty subset of $V$, and suppose that $\mathbf{u} \oplus (c \odot \mathbf{v})$ belongs
to $W$ for all vectors $\mathbf{u}$ and $\mathbf{v}$ in $W$ and all scalars $c$. By Theorem 3 it suffices to
show that $W$ is closed under addition and scalar multiplication. Suppose that $\mathbf{u}$ and
$\mathbf{v}$ are in $W$; then $\mathbf{u} \oplus (1 \odot \mathbf{v}) = \mathbf{u} \oplus \mathbf{v}$ is in $W$, so that $W$ is closed under addition.
Next, since $W$ is nonempty, let $\mathbf{u}$ be any vector in $W$. Then $\mathbf{0} = \mathbf{u} \oplus [(-1) \odot \mathbf{u}]$,
so that the zero vector is in $W$. Now, if $c$ is any scalar, then $c \odot \mathbf{u} = \mathbf{0} \oplus (c \odot \mathbf{u})$
and hence is in $W$. Therefore, $W$ is also closed under scalar multiplication.

Conversely, if $W$ is a subspace with $\mathbf{u}$ and $\mathbf{v}$ in $W$, and $c$ a scalar, then since $W$
is closed under addition and scalar multiplication, we know that $\mathbf{u} \oplus (c \odot \mathbf{v})$ is
in $W$.


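EXAMPLE 5 Let $W$ be the subset of $\mathbb{R}^3$ defined by
$$W = \left\{ \begin{bmatrix} 3t \\ 0 \\ -2t \end{bmatrix} \;\middle|\; t \in \mathbb{R} \right\}$$
Use Theorem 4 to show that $W$ is a subspace.

Solution Let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $W$ and $c$ be a real number. Then there are real numbers
$p$ and $q$ such that
$$\mathbf{u} \oplus (c \odot \mathbf{v}) = \begin{bmatrix} 3p \\ 0 \\ -2p \end{bmatrix} \oplus \left( c \odot \begin{bmatrix} 3q \\ 0 \\ -2q \end{bmatrix} \right) = \begin{bmatrix} 3(p + cq) \\ 0 \\ -2(p + cq) \end{bmatrix}$$
As this vector is in $W$, by Theorem 4, $W$ is a subspace.
Alternatively, the set $W$ can be written as
$$W = \left\{ t \begin{bmatrix} 3 \\ 0 \\ -2 \end{bmatrix} \;\middle|\; t \in \mathbb{R} \right\}$$
which is a line through the origin in $\mathbb{R}^3$.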
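
The one-step test of Theorem 4 can be spot-checked for this line. The sketch below is an illustration under stated assumptions: membership in $W$ is tested through the equivalent conditions "second entry is 0" and "$2v_1 + 3v_3 = 0$", and the names `in_W`, `combo`, and `element` are hypothetical.

```python
from fractions import Fraction

def in_W(v):
    """Membership in W = {(3t, 0, -2t)}: v2 = 0 and 2*v1 + 3*v3 = 0."""
    return v[1] == 0 and 2 * v[0] + 3 * v[2] == 0

def combo(u, c, v):
    """Return u (+) (c (.) v) with the standard operations on R^3."""
    return tuple(ui + c * vi for ui, vi in zip(u, v))

def element(t):
    """The vector of W with parameter t."""
    return (3 * t, 0, -2 * t)

samples = [Fraction(-5, 2), 0, 1, Fraction(7, 3)]

# Check u (+) (c (.) v) for every combination of sample parameters.
closed = all(in_W(combo(element(p), c, element(q)))
             for p in samples for q in samples for c in samples)
print(closed)
```

A finite sample cannot replace the algebraic argument, but it confirms that the combination has parameter $p + cq$, as in the solution.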


We now consider what happens when subspaces are combined. In particular,
let $W_1$ and $W_2$ be subspaces of a vector space $V$. Then the intersection $W_1 \cap W_2$ is
also a subspace of $V$. To show this, let $\mathbf{u}$ and $\mathbf{v}$ be elements of $W_1 \cap W_2$ and let $c$
be a scalar. Since $W_1$ and $W_2$ are both subspaces, then by Theorem 4, $\mathbf{u} \oplus (c \odot \mathbf{v})$ is
in $W_1$ and is in $W_2$, and hence is in the intersection. Applying Theorem 4 again, we
have that $W_1 \cap W_2$ is a subspace.

The extension to an arbitrary number of subspaces is stated in Theorem 5.

THEOREM 5 The intersection of any collection of subspaces of a vector space is a subspace of
the vector space.


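Example 6 shows that the union of two subspaces need not be a subspace.

EXAMPLE 6 Let $W_1$ and $W_2$ be the subspaces of $\mathbb{R}^2$ with the standard operations given by
$$W_1 = \left\{ \begin{bmatrix} x \\ 0 \end{bmatrix} \;\middle|\; x \in \mathbb{R} \right\} \quad\text{and}\quad W_2 = \left\{ \begin{bmatrix} 0 \\ y \end{bmatrix} \;\middle|\; y \in \mathbb{R} \right\}$$
Show that $W_1 \cup W_2$ is not a subspace.

Solution The subspaces $W_1$ and $W_2$ consist of all vectors that lie on the $x$ axis and the $y$ axis,
respectively. Their union is the collection of all vectors that lie on either axis and
is given by
$$W_1 \cup W_2 = \left\{ \begin{bmatrix} x \\ y \end{bmatrix} \;\middle|\; x = 0 \text{ or } y = 0 \right\}$$
This set is not closed under addition since
$$\begin{bmatrix} 1 \\ 0 \end{bmatrix} \oplus \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$
which is not in $W_1 \cup W_2$, as shown in Fig. 2.

[Figure 2]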
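
The failure of closure for the union of the two axes can be illustrated in a few lines. This is a minimal sketch; the names `in_union` and `add` and the choice of the two unit vectors are illustrative assumptions.

```python
def in_union(v):
    """Membership in the union of the axes: x = 0 or y = 0."""
    x, y = v
    return x == 0 or y == 0

def add(u, v):
    """Standard componentwise addition in R^2."""
    return (u[0] + v[0], u[1] + v[1])

e1 = (1, 0)   # lies on the x axis
e2 = (0, 1)   # lies on the y axis
s = add(e1, e2)

print(in_union(e1), in_union(e2), in_union(s))  # prints: True True False
```

Each summand lies on an axis, but the sum $(1, 1)$ lies on neither, so the union is not closed under addition.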


Span of a Set of Vectors

Subspaces of a vector space can be constructed by collecting all linear combinations of
a set of vectors from the space. These subspaces are used to analyze certain properties
of the vector space. A linear combination is defined in abstract vector spaces exactly
as it is defined in $\mathbb{R}^n$ in Chap. 2.

DEFINITION 2 Linear Combination Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_k\}$ be a set of vectors in a vector
space $V$, and let $c_1, c_2, \ldots, c_k$ be scalars. A linear combination of the vectors of
$S$ is an expression of the form
$$(c_1 \odot \mathbf{v}_1) \oplus (c_2 \odot \mathbf{v}_2) \oplus \cdots \oplus (c_k \odot \mathbf{v}_k)$$

When the operations of vector addition and scalar multiplication are clear, we
will drop the use of the symbols $\oplus$ and $\odot$. For example, the linear combination given
in Definition 2 will be written as
$$c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_k \mathbf{v}_k = \sum_{i=1}^{k} c_i \mathbf{v}_i$$

Unless otherwise stated, the operations on the vector spaces $\mathbb{R}^n$, $M_{m \times n}$, $P_n$, and
their subspaces are the standard operations. Care is still needed when interpreting
expressions defining linear combinations to distinguish between vector space
operations and addition and multiplication of real numbers.

DEFINITION 3 Span of a Set of Vectors Let $V$ be a vector space and let $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be
a (finite) set of vectors in $V$. The span of $S$, denoted by span($S$), is the set
$$\operatorname{span}(S) = \{c_1 \mathbf{v}_1 + c_2 \mathbf{v}_2 + \cdots + c_n \mathbf{v}_n \mid c_1, c_2, \ldots, c_n \in \mathbb{R}\}$$

PROPOSITION 1 If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a set of vectors in a vector space $V$, then span($S$) is a
subspace.

Proof Let $\mathbf{u}$ and $\mathbf{w}$ be vectors in span($S$) and $c$ a scalar. Then there are scalars
$c_1, \ldots, c_n$ and $d_1, \ldots, d_n$ such that
$$\begin{aligned}
\mathbf{u} + c\mathbf{w} &= (c_1 \mathbf{v}_1 + \cdots + c_n \mathbf{v}_n) + c(d_1 \mathbf{v}_1 + \cdots + d_n \mathbf{v}_n) \\
&= (c_1 + cd_1)\mathbf{v}_1 + \cdots + (c_n + cd_n)\mathbf{v}_n
\end{aligned}$$
Therefore, $\mathbf{u} + c\mathbf{w}$ is in span($S$), and hence, by Theorem 4, the span is a subspace.


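EXAMPLE 7 Let $S$ be the subset of the vector space $\mathbb{R}^3$ defined by
$$S = \left\{ \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 4 \end{bmatrix} \right\}$$
Show that
$$\mathbf{v} = \begin{bmatrix} -4 \\ 4 \\ -6 \end{bmatrix}$$
is in span($S$).

Solution To determine if $\mathbf{v}$ is in the span of $S$, we consider the equation
$$c_1 \begin{bmatrix} 2 \\ -1 \\ 0 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix} + c_3 \begin{bmatrix} 1 \\ 1 \\ 4 \end{bmatrix} = \begin{bmatrix} -4 \\ 4 \\ -6 \end{bmatrix}$$
Solving this linear system, we obtain
$$c_1 = -2 \qquad c_2 = 1 \qquad\text{and}\qquad c_3 = -1$$
This shows that $\mathbf{v}$ is a linear combination of the vectors in $S$ and is thus in span($S$).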
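
The coefficients can be recovered mechanically by solving the linear system. The following is a minimal sketch of Gauss-Jordan elimination over exact rationals, assuming a square system with a unique solution; the function name `solve` is hypothetical and the routine is not taken from the text.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination.

    Returns the unique solution as a list, or None if A is singular.
    """
    n = len(A)
    # Build the augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(rhs)]
         for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero pivot.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        # Clear the pivot column in every other row.
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n] for row in M]

# Columns of A are the vectors of S from the example.
A = [[2, 1, 1],
     [-1, 3, 1],
     [0, -2, 4]]
v = [-4, 4, -6]
coeffs = solve(A, v)
print(coeffs)
```

The returned values agree with $c_1 = -2$, $c_2 = 1$, and $c_3 = -1$ from the solution.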


The span of a single nonzero vector in $\mathbb{R}^n$ is a line through the origin, and the span
of two linearly independent vectors is a plane through the origin as shown in Fig. 3.

[Figure 3: span{v}, a line through the origin containing multiples such as 3v and −3.5v; span{v1, v2}, a plane through the origin.]

Since every line through the origin in $\mathbb{R}^2$ and $\mathbb{R}^3$, and every plane through the
origin in $\mathbb{R}^3$, can be written as the span of vectors, these sets are subspaces.


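EXAMPLE 8 Let
$$S = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$$
Show that the span of $S$ is the subspace of $M_{2 \times 2}$ of all symmetric matrices.

Solution Recall that a $2 \times 2$ matrix is symmetric provided that it has the form
$$\begin{bmatrix} a & b \\ b & c \end{bmatrix}$$
Since any matrix in span($S$) has the form
$$a \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + c \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$$
span($S$) is the collection of all $2 \times 2$ symmetric matrices.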


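EXAMPLE 9 Show that
$$\operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \right\} = \mathbb{R}^3$$

Solution Let
$$\mathbf{v} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$$
be an arbitrary element of $\mathbb{R}^3$. The vector $\mathbf{v}$ is in span($S$) provided that there are
scalars $c_1$, $c_2$, and $c_3$ such that
$$c_1 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix} + c_3 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$$
This linear system in matrix form is given by
$$\left[\begin{array}{rrr|r} 1 & 1 & 1 & a \\ 1 & 0 & 1 & b \\ 1 & 2 & 0 & c \end{array}\right]$$
After row-reducing, we obtain
$$\left[\begin{array}{rrr|r} 1 & 0 & 0 & -2a + 2b + c \\ 0 & 1 & 0 & a - b \\ 0 & 0 & 1 & 2a - b - c \end{array}\right]$$
From this final augmented matrix the original system is consistent, having solution
$c_1 = -2a + 2b + c$, $c_2 = a - b$, and $c_3 = 2a - b - c$, for all choices of $a$, $b$, and
$c$. Thus, every vector in $\mathbb{R}^3$ can be written as a linear combination of the three
given vectors. Hence, the span of the three vectors is all of $\mathbb{R}^3$.

[Figure 4: the plane $7x - y + 9z = 0$.]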


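EXAMPLE 10 Show that
$$\operatorname{span}\left\{ \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 4 \\ 1 \\ -3 \end{bmatrix}, \begin{bmatrix} -6 \\ 3 \\ 5 \end{bmatrix} \right\} \neq \mathbb{R}^3$$

Solution We approach this problem in the same manner as in Example 9. In this case,
however, the resulting linear system is not always consistent. We can see this by
reducing the augmented matrix
$$\left[\begin{array}{rrr|r} -1 & 4 & -6 & a \\ 2 & 1 & 3 & b \\ 1 & -3 & 5 & c \end{array}\right] \quad\text{to}\quad \left[\begin{array}{rrr|c} -1 & 4 & -6 & a \\ 0 & 9 & -9 & b + 2a \\ 0 & 0 & 0 & c + \frac{7}{9}a - \frac{1}{9}b \end{array}\right]$$
This last augmented matrix shows that the original system is consistent only if
$7a - b + 9c = 0$. This is the equation of a plane in 3-space, and hence the span is
not all of $\mathbb{R}^3$. See Fig. 4.

Notice that the solution to the equation $7a - b + 9c = 0$ can be written in
parametric form by letting $b = s$, $c = t$, and $a = \frac{1}{7}s - \frac{9}{7}t$, so that
$$\operatorname{span}\left\{ \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 4 \\ 1 \\ -3 \end{bmatrix}, \begin{bmatrix} -6 \\ 3 \\ 5 \end{bmatrix} \right\} = \left\{ s \begin{bmatrix} \frac{1}{7} \\ 1 \\ 0 \end{bmatrix} + t \begin{bmatrix} -\frac{9}{7} \\ 0 \\ 1 \end{bmatrix} \;\middle|\; s, t \in \mathbb{R} \right\}$$
In this way, we see that the span is the subspace of all linear combinations of
two linearly independent vectors, highlighting the geometric interpretation of the
solution as a plane.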
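
The consistency condition can be checked against the three given vectors and the two parametric direction vectors. The sketch below is illustrative; `on_plane` and `det3` are hypothetical helper names, and the determinant is computed by cofactor expansion along the first row.

```python
from fractions import Fraction

def on_plane(v):
    """True when (a, b, c) satisfies 7a - b + 9c = 0."""
    a, b, c = v
    return 7 * a - b + 9 * c == 0

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along row 1."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

vectors = [(-1, 2, 1), (4, 1, -3), (-6, 3, 5)]
directions = [(Fraction(1, 7), 1, 0), (Fraction(-9, 7), 0, 1)]

# Matrix whose columns are the three given vectors.
A = [[-1, 4, -6],
     [2, 1, 3],
     [1, -3, 5]]

print(all(on_plane(v) for v in vectors),
      all(on_plane(d) for d in directions),
      det3(A))
```

All five vectors lie on the plane, and the zero determinant confirms that the three given vectors are linearly dependent.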


With Examples 9 and 10 we have completed the groundwork for the notion of a
basis, which is central to linear algebra and is the subject of Sec. 3.3. Specifically, in
Example 9, we saw that the set of vectors
$$S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \right\}$$
spans $\mathbb{R}^3$. These vectors are also linearly independent. To see this, observe that the
matrix
$$A = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 2 & 0 \end{bmatrix}$$
whose column vectors are the vectors of $S$, is row equivalent to the $3 \times 3$ identity
matrix, as seen in the solution to Example 9. [Another way of showing that $S$ is
linearly independent is to observe that $\det(A) = 1 \neq 0$.] Consequently, by Theorem 7
of Sec. 2.3, we have that every vector in $\mathbb{R}^3$ can be written in only one way as a
linear combination of the vectors of $S$.

On the other hand, the span of the set of vectors
$$S' = \{\mathbf{v}_1', \mathbf{v}_2', \mathbf{v}_3'\} = \left\{ \begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} 4 \\ 1 \\ -3 \end{bmatrix}, \begin{bmatrix} -6 \\ 3 \\ 5 \end{bmatrix} \right\}$$
of Example 10 is a plane passing through the origin. Hence, not every vector in $\mathbb{R}^3$
can be written as a linear combination of the vectors in $S'$. As we expect, these vectors
are linearly dependent since
$$\det\left( \begin{bmatrix} -1 & 4 & -6 \\ 2 & 1 & 3 \\ 1 & -3 & 5 \end{bmatrix} \right) = 0$$
In particular, $\mathbf{v}_3' = 2\mathbf{v}_1' - \mathbf{v}_2'$. The vectors $\mathbf{v}_1'$ and $\mathbf{v}_2'$ are linearly independent vectors
which span the plane shown in Fig. 4, but not $\mathbb{R}^3$.

To pursue these notions a bit further, there are many sets of vectors which span
$\mathbb{R}^3$. For example, the set


1 0 0 1 
B= {e, e2, e3, V} = Oj, | 1 |, | 0 |, 
0 0 1 3 


spans $\mathbb{R}^3$, but by Theorem 3 of Sec. 2.3 must necessarily be linearly dependent. The
ideal case, in terms of minimizing the number of vectors, is illustrated in Example 9
where the three linearly independent vectors of $S$ span $\mathbb{R}^3$. In Sec. 3.3 we will see
that $S$ is a basis for $\mathbb{R}^3$, and that every basis for $\mathbb{R}^3$ consists of exactly three linearly
independent vectors.


Show that the set of matrices 


si] 2 alban 


does not span M2x2. Describe span(S). 


Solution 


Solution 




The equation 
-1 0 toi | @ e@ 
aj 2 4 | e 


is equivalent to the linear system 


— Gila) SA 
Gy = 
Mee) =€ 
Cc 
From these equations we see that 
ei, = al and o 


which gives 


Therefore, 


EXAMPLE 12 Show that the set of polynomials
$$S = \{x^2 + 2x + 1,\; x^2 + 2,\; x\}$$
spans the vector space $P_2$.

Solution An arbitrary vector in $P_2$ can be written in the form $ax^2 + bx + c$. To determine
whether span($S$) $= P_2$, we consider the equation
$$c_1(x^2 + 2x + 1) + c_2(x^2 + 2) + c_3 x = ax^2 + bx + c$$
which simplifies to
$$(c_1 + c_2)x^2 + (2c_1 + c_3)x + (c_1 + 2c_2) = ax^2 + bx + c$$
Since two polynomials are equal if and only if the coefficients of like terms are
equal, equating coefficients in the previous equation gives, in matrix form, the linear
system
$$\left[\begin{array}{rrr|r} 1 & 1 & 0 & a \\ 2 & 0 & 1 & b \\ 1 & 2 & 0 & c \end{array}\right]$$
This matrix reduces to
$$\left[\begin{array}{rrr|r} 1 & 0 & 0 & 2a - c \\ 0 & 1 & 0 & -a + c \\ 0 & 0 & 1 & -4a + b + 2c \end{array}\right]$$
Hence, the linear system has the unique solution $c_1 = 2a - c$, $c_2 = -a + c$, and
$c_3 = -4a + b + 2c$, for all $a$, $b$, and $c$. Therefore, span($S$) $= P_2$.
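
The closed-form coefficients can be verified by substituting them back into the linear combination. This is a minimal sketch; `coefficients` and `combine` are hypothetical names, and polynomials are represented by their coefficient triples for $x^2$, $x$, and the constant term.

```python
def coefficients(a, b, c):
    """Scalars (c1, c2, c3) from the example for the target ax^2 + bx + c."""
    return (2 * a - c, -a + c, -4 * a + b + 2 * c)

def combine(c1, c2, c3):
    """Coefficients (x^2, x, 1) of c1(x^2 + 2x + 1) + c2(x^2 + 2) + c3*x."""
    return (c1 + c2, 2 * c1 + c3, c1 + 2 * c2)

# Substituting the scalars back should reproduce every target polynomial.
targets = [(a, b, c) for a in range(-2, 3)
           for b in range(-2, 3) for c in range(-2, 3)]
recovered = all(combine(*coefficients(a, b, c)) == (a, b, c)
                for a, b, c in targets)
print(recovered)
```

Because the identity holds symbolically, the check succeeds for every sampled triple $(a, b, c)$.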


The Null Space and Column Space of a Matrix 


Two special subspaces associated with every matrix A are the null space and column 
space of the matrix. 


DEFINITION 4 Null Space and Column Space Let A be an m x n matrix.

1. The null space of A, denoted by N(A), is the set of all vectors x in $\mathbb{R}^n$ such
that Ax = 0.

2. The column space of A, denoted by col(A), is the set of all linear combinations
of the column vectors of A.

Observe that N(A) is a subset of $\mathbb{R}^n$ and col(A) is a subset of $\mathbb{R}^m$. Moreover,
by Proposition 1, col(A) is a subspace of $\mathbb{R}^m$. Using this terminology, we give a
restatement of Theorem 2 of Sec. 2.2.

THEOREM 6 Let A be an m x n matrix. The linear system Ax = b is consistent if and only if
b is in the column space of A.


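Let

\[
A = \begin{bmatrix} 1 & -1 & -2 \\ -1 & 2 & 3 \\ 2 & -2 & -2 \end{bmatrix}
\qquad \text{and} \qquad
\mathbf{b} = \begin{bmatrix} 3 \\ 1 \\ -2 \end{bmatrix}
\]

a. Determine whether b is in col(A).

b. Find N(A).

Solution

a. By Theorem 6, the vector b is in col(A) if and only if there is a vector x such
that Ax = b. The corresponding augmented matrix is given by

\[
\left[\begin{array}{ccc|c} 1 & -1 & -2 & 3 \\ -1 & 2 & 3 & 1 \\ 2 & -2 & -2 & -2 \end{array}\right]
\quad \text{which reduces to} \quad
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & 8 \\ 0 & 0 & 1 & -4 \end{array}\right]
\]

Hence, the linear system Ax = b is consistent, and vector b is in col(A).
Specifically,

\[
\begin{bmatrix} 3 \\ 1 \\ -2 \end{bmatrix}
= 3 \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}
+ 8 \begin{bmatrix} -1 \\ 2 \\ -2 \end{bmatrix}
- 4 \begin{bmatrix} -2 \\ 3 \\ -2 \end{bmatrix}
\]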


b. To find the null space of A, we solve the homogeneous equation Ax = 0. 
The corresponding augmented matrix for this linear system is the same as in 
part (a), except for the right column that consists of three zeros. Consequently, 
the only solution is the trivial solution and hence N(A) = {0}. 
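Both parts of this example reduce to row reduction, which can be sketched in a few lines of Python. The routine `rref` below is a plain Gauss-Jordan elimination over exact fractions written for this illustration; it is not taken from the text, and the matrix entries are those of the augmented matrix in part (a).

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, pivot column indices), using exact fractions."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # Look for a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [x / lead for x in m[r]]  # scale so the pivot equals 1
        for i in range(len(m)):
            if i != r:
                factor = m[i][c]
                m[i] = [x - factor * y for x, y in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

A = [[1, -1, -2], [-1, 2, 3], [2, -2, -2]]
b = [3, 1, -2]

augmented = [row + [bi] for row, bi in zip(A, b)]
reduced, pivots = rref(augmented)

# b is in col(A) exactly when the augmented column is not a pivot column.
b_in_col_A = len(A[0]) not in pivots
# Here every column of A is a pivot column, so the last column holds x.
solution = [row[-1] for row in reduced]

# N(A) = {0} exactly when every column of A is a pivot column.
_, pivots_A = rref(A)
trivial_null_space = pivots_A == list(range(len(A[0])))

print(b_in_col_A, [int(x) for x in solution], trivial_null_space)
```

Exact fractions are used instead of floating point so that the test for a zero entry is reliable.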


In Theorem 7 we show that the null space of a matrix also is a subspace. 


THEOREM 7 Let A be an m x n matrix. Then the null space of A is a subspace of $\mathbb{R}^n$.


Proof The null space of A is nonempty since 0 is in N(A). That is, A0 = 0. 
Now let u and v be vectors in N(A) and c a scalar. Then 


A(u + cv) = Au + A(cv) 
= Au+cA(v) 
=0+c0=0 


Hence, u + cv is in N(A), and therefore by Theorem 4, N(A) is a subspace.


Fact Summary

Let V be a vector space and W a nonempty subset of V.

1. To verify that W is a subspace of V, show that $\mathbf{u} \oplus c \odot \mathbf{v}$ is in W for any u
and v in W and any scalar c.

2. The span of a set of vectors from V is a subspace.

3. The span of a single nonzero vector in $\mathbb{R}^2$ or $\mathbb{R}^3$ is a line that passes
through the origin. The span of two linearly independent vectors in $\mathbb{R}^3$ is a
plane that passes through the origin. These sets are subspaces.

4. The intersection of subspaces is a subspace. The union of two subspaces
may not be a subspace.

5. If A is an m x n matrix, the null space of A is a subspace of $\mathbb{R}^n$ and the
column space of A is a subspace of $\mathbb{R}^m$.

6. The linear system Ax = b is consistent if and only if b is in the column
space of A.




Exercise Set 3.2 


In Exercises 1-6, determine whether the subset S of
$\mathbb{R}^2$ is a subspace. If S is not a subspace, find vectors u
and v in S such that u + v is not in S; or a vector u
and a scalar c such that cu is not in S.


L(I 


14. 
15. 
16. 
17. 
18. 


Let S be the set of all skew-symmetric matrices. 
Let S be the set of all upper triangular matrices. 
Let S be the set of all diagonal matrices. 

Let S be the set of all matrices with az = 0. 


Let S be the set of all matrices with 
aii +422 = 0. 


In Exercises 19—24, determine whether the subset S of 
Ps is a subspace. 


19. Let S be the set of all polynomials with degree 


2 s={[3]] 29 
ssip +29} 
os {[F]] sees 
s.s={ 57, | xer} 


a s= {| a] eR} 


In Exercises 7-10, determine whether the subset S of
$\mathbb{R}^3$ is a subspace.


x, + X3 = —2 


X1X2X3 = 0 


10. S = 2 
X3 


x1, X3 > 0 


In Exercises 11—18, determine whether the subset S of 
M2x2 is a subspace. 


11. Let S be the set of all symmetric matrices. 
12. Let S be the set of all idempotent matrices. 


13. Let S be the set of all invertible matrices. 


20. 


21. 


22. 


23. 


equal to 3. 


Let S be the set of all polynomials with even 
degree. 


Let S be the set of all polynomials such that 
p(0) =0. 


Let S be the set of all polynomials of the form 
p(x) = ax’. 


Let S be the set of all polynomials of the form 


p(x) = ax? +1. 
24. Let S be the set of all polynomials of degree less 
than or equal to 4. 


In Exercises 25 and 26, determine if the vector v is in 
the span of 


1 —1 —1 

S= 1j, | —1 |, 2 

0 1 0 
1 
25. v= | —1 
1 
—2 
26. v = 7 
—3 


In Exercises 27 and 28, determine if the matrix M is 
in the span of 


Se eee 


In Exercises 29 and 30, determine if the polynomial 


p(x) is in the span of 


29, p(x) = 2x? — 6x — 11 


30. 3x2 -x —4 


In Exercises 31—36, give an explicit description of the 


span of S. 


31. S= 


1 

3 

—1 
2 1 
eh ls 2 
1 —1 


35. S= {x, œ +1), x? +3x +1} 
36. S= {x7 — 4,2- x, x° +x +2} 


S= {1 +x, x? — 2, 3x} 


In Exercises 37—40, a subset S of R? is given. 


a. Find span(S). 


b. Is S linearly independent? 


2 
37. S = 1 
-1 
1 
38. S = 1 
2 
3 
39. S = 3 
2 


3 

0 

=2 
0 2 
—1 |,| 5 
1 1 
0 1 
1 |,} -l 
0 -1 


40. S= 


41. Let 


a. Find span(S). 




no) NNR 
| 
— 
| 
— 


b. Is S linearly independent? 


c. Let 


1 =f 1 
Deviate cage Ws 
2 = = 
= 

4 

5 




Is span(T) = $\mathbb{R}^3$? Is T linearly independent?


d. Let 


H = 


1 —1 1 
2- 3 |, 2 
2 —1 -1 


Is span(H) = $\mathbb{R}^3$? Is H linearly independent?


42. Let 


-fli 


a. Find span(S). 


b. Is S linearly independent? 


Let 


Is span(T) = M2x2? Is T linearly 


independent? 




43. 


44. 




Let 
S = {1,x — 3, x? + 2x, 2x7 +3x +5} 


a. Find span(S). 


b. Is S linearly independent? 


c. Show that 2x? + 3x + 5 is a linear combination 


of the other three polynomials in S. 


d. Let T = {1, x — 3, x? + 2x, x3}. Is T linearly 
independent? Is span(T) = P3? 


Let 
2s—t 
s 
t 
—s 


S= s,teR 
a. Show that S is a subspace of R4. 
b. Find two vectors that span S. 


c. Are the two vectors found in part (b) linearly 
independent? 


47. 


48. 


49. 


50. 


c. Are the three matrices that generate S linearly 
independent? 


Let A be a 2 x 3 matrix and let 


s= {xem |ax=[!]} 


Is S a subspace? Explain. 
Let A be an m x n matrix and let 
S= {xe R" | Ax=0} 
Is S a subspace? Explain. 
Let A be a fixed n x n matrix and let 
S = {B € Mnxn | AB = BA} 
Is S a subspace? Explain. 


Suppose S and T are subspaces of a vector 
space V. Define 


S+T={u+vilueS,veT} 
Show that S$ + T is a subspace of V. 


d. Is S = R*? 
51. Let S = span({u,, u2, ...U,,}) and 
45. Let T = span({vj, V2, ...Vn}) be subspaces of a 
=s vector space V. Show that 
S= s —5t s,teR 
3t +25 S+T =span({w,...Uy,V1,.--Vn}) 
a. Show that S is a subspace of R°. (See Exercise 50.) 
b. Find a set of vectors that span S. 52. Let 
c. Are the two vectors found in part (b) linearly _ { | x =x x,y zE R} 
independent? y z 
d. Is S = R?? and 
46. Let al a abceR} 
—a c 


NEMA EE E E 
a Gp Meal ay o 


a. Describe the subspace S. 
b. Is S= Məx2? 


deal 


a. Show that S and T are subspaces. 
b. Describe all matrices in S + T. 
(See Exercises 50 and 51.) 


3.3 Basis and Dimension


In Sec. 2.3 we introduced the notion of linear independence and its connection to
the minimal sets that can be used to generate or span $\mathbb{R}^n$. In this section we explore
this connection further and see how to determine whether a spanning set is minimal.
This leads to the concept of a basis for an abstract vector space. As a first step, we
generalize the concept of linear independence introduced in Sec. 2.3 to abstract
vector spaces.


DEFINITION 1 Linear Independence and Linear Dependence The set of vectors S =
$\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\}$ in a vector space V is called linearly independent provided that
the only solution to the equation

\[
c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_m\mathbf{v}_m = \mathbf{0}
\]

is the trivial solution $c_1 = c_2 = \cdots = c_m = 0$. If the equation has a nontrivial solution,
then the set S is called linearly dependent.


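EXAMPLE 1 Let

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}
\qquad
\mathbf{v}_2 = \begin{bmatrix} 0 \\ 2 \\ 2 \end{bmatrix}
\qquad
\mathbf{v}_3 = \begin{bmatrix} -3 \\ 4 \\ 7 \end{bmatrix}
\]

and let W = span$\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$.

a. Show that $\mathbf{v}_3$ is a linear combination of $\mathbf{v}_1$ and $\mathbf{v}_2$.
b. Show that span$\{\mathbf{v}_1, \mathbf{v}_2\}$ = W.
c. Show that $\mathbf{v}_1$ and $\mathbf{v}_2$ are linearly independent.

Solution a. To solve the vector equation

\[
c_1 \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + c_2 \begin{bmatrix} 0 \\ 2 \\ 2 \end{bmatrix} = \begin{bmatrix} -3 \\ 4 \\ 7 \end{bmatrix}
\]

we row-reduce the corresponding augmented matrix for the linear system to
obtain

\[
\left[\begin{array}{cc|c} 1 & 0 & -3 \\ 0 & 2 & 4 \\ -1 & 2 & 7 \end{array}\right]
\longrightarrow
\left[\begin{array}{cc|c} 1 & 0 & -3 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{array}\right]
\]

The solution to the vector equation above is $c_1 = -3$ and $c_2 = 2$, therefore

\[
\mathbf{v}_3 = -3\mathbf{v}_1 + 2\mathbf{v}_2
\]

Notice that the vector $\mathbf{v}_3$ lies in the plane spanned by $\mathbf{v}_1$ and $\mathbf{v}_2$, as shown
in Fig. 1.

Figure 1

b. From part (a) an element of W = $\{c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 \mid c_1, c_2, c_3 \in \mathbb{R}\}$ can be
written in the form

\[
\begin{aligned}
c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3\mathbf{v}_3 &= c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + c_3(-3\mathbf{v}_1 + 2\mathbf{v}_2) \\
&= (c_1 - 3c_3)\mathbf{v}_1 + (c_2 + 2c_3)\mathbf{v}_2
\end{aligned}
\]

and therefore, every vector in W is a linear combination of $\mathbf{v}_1$ and $\mathbf{v}_2$. As a
result, the vector $\mathbf{v}_3$ is not needed to generate W, so that span$\{\mathbf{v}_1, \mathbf{v}_2\}$ = W.

c. Since neither vector is a scalar multiple of the other, the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ are
linearly independent.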
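The two computations in parts (a) and (b) can be confirmed with a few lines of Python. This is only an illustrative sketch: the helper `lin_comb` and the sample scalars are assumptions made here, not part of the example.

```python
v1, v2, v3 = (1, 0, -1), (0, 2, 2), (-3, 4, 7)

def lin_comb(coeffs, vectors):
    """Return the linear combination sum(c * v) as a tuple."""
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors))
                 for i in range(len(vectors[0])))

# Part (a): v3 = -3*v1 + 2*v2.
print(lin_comb((-3, 2), (v1, v2)) == v3)

# Part (b): a combination of v1, v2, v3 collapses to one of v1 and v2 alone,
# with coefficients c1 - 3*c3 and c2 + 2*c3 (sample scalars chosen arbitrarily).
c1, c2, c3 = 5, -1, 4
w = lin_comb((c1, c2, c3), (v1, v2, v3))
print(w == lin_comb((c1 - 3 * c3, c2 + 2 * c3), (v1, v2)))
```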


In Example 1, we were able to reduce the number of linearly dependent vectors
that span W to a linearly independent set of vectors which also spans W. We
accomplished this by eliminating the vector $\mathbf{v}_3$ from the set, which, as we saw in the
solution, is a linear combination of the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ and hence does not affect
the span. Theorem 8 gives a general description of the process.


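THEOREM 8 Let $\mathbf{v}_1, \ldots, \mathbf{v}_n$ be vectors in a vector space V, and let W = span$\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$. If
$\mathbf{v}_n$ is a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}$, then

\[
W = \operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}\}
\]

Proof If v is in span$\{\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}\}$, then there are scalars $c_1, c_2, \ldots, c_{n-1}$ such
that $\mathbf{v} = c_1\mathbf{v}_1 + \cdots + c_{n-1}\mathbf{v}_{n-1}$. Then $\mathbf{v} = c_1\mathbf{v}_1 + \cdots + c_{n-1}\mathbf{v}_{n-1} + 0\mathbf{v}_n$, so that v
is also in span$\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$. Therefore,

\[
\operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}\} \subseteq \operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}
\]

Conversely, if v is in span$\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$, then there are scalars $c_1, \ldots, c_n$ such
that $\mathbf{v} = c_1\mathbf{v}_1 + \cdots + c_n\mathbf{v}_n$. Also, since $\mathbf{v}_n$ is a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}$,
there are scalars $d_1, \ldots, d_{n-1}$ such that $\mathbf{v}_n = d_1\mathbf{v}_1 + \cdots + d_{n-1}\mathbf{v}_{n-1}$. Then

\[
\begin{aligned}
\mathbf{v} &= c_1\mathbf{v}_1 + \cdots + c_{n-1}\mathbf{v}_{n-1} + c_n\mathbf{v}_n \\
&= c_1\mathbf{v}_1 + \cdots + c_{n-1}\mathbf{v}_{n-1} + c_n(d_1\mathbf{v}_1 + \cdots + d_{n-1}\mathbf{v}_{n-1}) \\
&= (c_1 + c_n d_1)\mathbf{v}_1 + \cdots + (c_{n-1} + c_n d_{n-1})\mathbf{v}_{n-1}
\end{aligned}
\]

so that $\mathbf{v} \in$ span$\{\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}\}$ and span$\{\mathbf{v}_1, \ldots, \mathbf{v}_n\} \subseteq$ span$\{\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}\}$.
Therefore,

\[
W = \operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_n\} = \operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_{n-1}\}
\]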


EXAMPLE 2 Compare the column spaces of the matrices


and B= 


= NOR Re 
aN NN = 
Ww = N eS 
e= OoOO 
= N = Re 
(oa n N E 


Solution

By using the methods presented in Chap. 2 it can be shown that the column
vectors of the matrix A are linearly independent. Since the column vectors of B
consist of a set of five vectors in $\mathbb{R}^4$, by Theorem 3 of Sec. 2.3, the vectors are
linearly dependent. In addition, the first four column vectors of B are the same as
the linearly independent column vectors of A, hence by Theorem 5 of Sec. 2.3 the
fifth column vector of B must be a linear combination of the other four vectors.
Finally by Theorem 8, we know that col(A) = col(B).


As a consequence of Theorem 8, a set of vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ such that V =
span$\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is minimal, in the sense of the number of spanning vectors, when
they are linearly independent. We also saw in Chap. 2 that when a vector in $\mathbb{R}^n$ can
be written as a linear combination of vectors from a linearly independent set, then the
representation is unique. The same result holds for abstract vector spaces.


THEOREM 9 If B = $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m\}$ is a linearly independent set of vectors in a vector space V,
then every vector in span(B) can be written uniquely as a linear combination of
vectors from B.


Motivated by these ideas, we now define what we mean by a basis of a vector
space.


DEFINITION 2 Basis for a Vector Space A subset B of a vector space V is a basis for V
provided that

1. B is a linearly independent set of vectors in V
2. span(B) = V


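As an example, the set of vectors

\[
S = \{\mathbf{e}_1, \ldots, \mathbf{e}_n\}
\]

spans $\mathbb{R}^n$ and is linearly independent, so that S is a basis for $\mathbb{R}^n$. This particular basis
is called the standard basis for $\mathbb{R}^n$. In Example 3 we give a basis for $\mathbb{R}^3$, which is
not the standard basis.


EXAMPLE 3 Show that the set

\[
B = \left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \right\}
\]

is a basis for $\mathbb{R}^3$.

Solution

First, to show that B spans $\mathbb{R}^3$, we must show that the equation

\[
c_1 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + c_3 \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}
\]

has a solution for every choice of a, b, and c in $\mathbb{R}$. To solve this linear system, we
reduce the corresponding augmented matrix

\[
\left[\begin{array}{ccc|c} 1 & 1 & 0 & a \\ 1 & 1 & 1 & b \\ 0 & 1 & -1 & c \end{array}\right]
\quad \text{to} \quad
\left[\begin{array}{ccc|c} 1 & 0 & 0 & 2a - b - c \\ 0 & 1 & 0 & -a + b + c \\ 0 & 0 & 1 & -a + b \end{array}\right]
\]

Therefore, $c_1 = 2a - b - c$, $c_2 = -a + b + c$, and $c_3 = -a + b$. For example, suppose
that $\mathbf{v} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$; then

\[
\begin{aligned}
c_1 &= 2(1) - 2 - 3 = -3 \\
c_2 &= -1 + 2 + 3 = 4 \\
c_3 &= -1 + 2 = 1
\end{aligned}
\]

so that

\[
-3 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + 4 \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
\]

Since the linear system is consistent for all choices of a, b, and c, we know that
span(B) = $\mathbb{R}^3$.

To show that B is linearly independent, we compute the determinant of the
matrix whose column vectors are the vectors of B, that is,

\[
\begin{vmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & -1 \end{vmatrix} = -1
\]

Since this determinant is nonzero, by Theorem 9 of Sec. 2.3 the set B is linearly
independent. Therefore, B is a basis for $\mathbb{R}^3$. Alternatively, to show that B is linearly
independent, notice from the reduced matrix above that

\[
\begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & -1 \end{bmatrix}
\]

is row equivalent to I. Again by Theorem 9 of Sec. 2.3, B is linearly independent.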
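The coefficient formulas from the reduced matrix give a direct way to write any vector of $\mathbb{R}^3$ in terms of B. The Python sketch below applies them and recombines the result; the names `coords` and `combine` are chosen here for illustration only.

```python
# The three vectors of B from Example 3.
B = [(1, 1, 0), (1, 1, 1), (0, 1, -1)]

def coords(a, b, c):
    """Coefficients (c1, c2, c3) read off the reduced augmented matrix."""
    return (2 * a - b - c, -a + b + c, -a + b)

def combine(coeffs):
    """Return c1*B[0] + c2*B[1] + c3*B[2]."""
    return tuple(sum(k * v[i] for k, v in zip(coeffs, B)) for i in range(3))

print(coords(1, 2, 3))           # the coefficients used in the example
print(combine(coords(1, 2, 3)))  # recombining should return (1, 2, 3)
```

Since the formulas are valid for every a, b, and c, the same round trip works for any vector, which is the spanning claim in computational form.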


As we have already illustrated in the examples above, bases for a vector space are
not unique. For example, consider the standard basis B = $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ for $\mathbb{R}^3$. Another
basis for $\mathbb{R}^3$ is given by B' = $\{2\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$, where we have simply multiplied the first
vector by 2.


Theorem 10 generalizes this idea, showing that an infinite family of bases can be
derived from a given basis for a vector space by scalar multiplication.


THEOREM 10 Let B = $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be a basis for a vector space V and c a nonzero scalar. Then
$B_c = \{c\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis.


Proof If v is an element of the vector space V, then since B is a basis there are
scalars $c_1, \ldots, c_n$ such that $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$. But since $c \neq 0$, we can
also write

\[
\mathbf{v} = \frac{c_1}{c}(c\mathbf{v}_1) + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n
\]

so that v is a linear combination of the vectors in $B_c$. Thus, span($B_c$) = V. To
show that $B_c$ is linearly independent, consider the equation

\[
c_1(c\mathbf{v}_1) + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0}
\]

By vector space axiom 9 we can write this as

\[
(c_1 c)\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0}
\]

Now, since B is linearly independent, the only solution to the previous equation is
the trivial solution

\[
c_1 c = 0 \qquad c_2 = 0 \qquad \ldots \qquad c_n = 0
\]

Since $c \neq 0$, then $c_1 = 0$. Therefore, $B_c$ is linearly independent and hence is a
basis.


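EXAMPLE 4 Let W be the subspace of $M_{2\times 2}$ of matrices with trace equal to 0, and let

\[
S = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \right\}
\]

Show that S is a basis for W.

Solution

In Example 2 of Sec. 3.2 we showed that W is a subspace of $M_{2\times 2}$. To show that
span(S) = W, first recall that a matrix

\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

has trace 0 if and only if a + d = 0, so that A has the form

\[
A = \begin{bmatrix} a & b \\ c & -a \end{bmatrix}
\]

Since for every such matrix

\[
\begin{bmatrix} a & b \\ c & -a \end{bmatrix}
= a \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
+ b \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
+ c \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
\]

then span(S) = W. We also know that S is linearly independent since the linear
system

\[
c_1 \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}
+ c_2 \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}
+ c_3 \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}
= \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

is equivalent to

\[
\begin{bmatrix} c_1 & c_2 \\ c_3 & -c_1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}
\]

which has only the trivial solution $c_1 = c_2 = c_3 = 0$. Thus, S is a basis for W.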
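The decomposition used in this example can be tried on a concrete matrix. The Python sketch below assumes the three matrices of S are the ones with 1 and -1 on the diagonal, a single 1 in the upper right, and a single 1 in the lower left; the sample trace-zero matrix and the function name `combine` are arbitrary choices made for illustration.

```python
# The three matrices of S, stored as tuples of rows (assumed ordering).
M1 = ((1, 0), (0, -1))
M2 = ((0, 1), (0, 0))
M3 = ((0, 0), (1, 0))

def combine(c1, c2, c3):
    """Return the 2x2 matrix c1*M1 + c2*M2 + c3*M3."""
    return tuple(
        tuple(c1 * M1[i][j] + c2 * M2[i][j] + c3 * M3[i][j] for j in range(2))
        for i in range(2)
    )

# A sample matrix with trace 4 + (-4) = 0.
A = ((4, -2), (7, -4))

# The coefficients are simply a, b, c read from the entries of A.
print(combine(A[0][0], A[0][1], A[1][0]) == A)

# combine(c1, c2, c3) always has the form ((c1, c2), (c3, -c1)), so it is
# the zero matrix only when c1 = c2 = c3 = 0.
print(combine(0, 0, 0))
```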


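Similar to the situation for $\mathbb{R}^n$, there is a natural set of matrices in $M_{m\times n}$ that
comprise a standard basis. Let $\mathbf{e}_{ij}$ be the matrix with a one in the ij position, and zeros
elsewhere. The set S = $\{\mathbf{e}_{ij} \mid 1 \le i \le m,\ 1 \le j \le n\}$ is the standard basis for $M_{m\times n}$.
For example, the standard basis for $M_{2\times 2}$ is

\[
S = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}
\]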


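EXAMPLE 5 Determine whether

\[
B = \left\{ \begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}, \begin{bmatrix} -1 & 2 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & -4 \end{bmatrix} \right\}
\]

is a basis for $M_{2\times 2}$.

Solution

Let

\[
A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

be an arbitrary matrix in $M_{2\times 2}$. To see if A is in the span of B, we consider the
equation

\[
c_1 \begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}
+ c_2 \begin{bmatrix} -1 & 2 \\ 1 & 0 \end{bmatrix}
+ c_3 \begin{bmatrix} 0 & 1 \\ 0 & -4 \end{bmatrix}
= \begin{bmatrix} a & b \\ c & d \end{bmatrix}
\]

The augmented matrix corresponding to this equation is given by

\[
\left[\begin{array}{ccc|c} 1 & -1 & 0 & a \\ 3 & 2 & 1 & b \\ 2 & 1 & 0 & c \\ 1 & 0 & -4 & d \end{array}\right]
\]

After row-reducing, we obtain

\[
\left[\begin{array}{ccc|c} 1 & -1 & 0 & a \\ 0 & 5 & 1 & -3a + b \\ 0 & 0 & -3 & -a - 3b + 5c \\ 0 & 0 & 0 & a + 4b - 7c + d \end{array}\right]
\]

Observe that the above linear system is consistent only if a + 4b - 7c + d = 0.
Hence, B does not span $M_{2\times 2}$, and therefore is not a basis.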
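The consistency condition gives a one-line membership test for span(B). The Python sketch below applies it to the three matrices of B, read here from the columns of the augmented matrix, and to the identity matrix; the function name `in_span` is an illustrative choice.

```python
def in_span(a, b, c, d):
    """Consistency condition from the last row of the reduced matrix."""
    return a + 4 * b - 7 * c + d == 0

# Each matrix is stored as ((a, b), (c, d)).
B = [((1, 3), (2, 1)), ((-1, 2), (1, 0)), ((0, 1), (0, -4))]

# Every member of B must satisfy the condition, since it lies in span(B).
print([in_span(a, b, c, d) for (a, b), (c, d) in B])

# The identity matrix fails the condition, so it is not in span(B),
# which confirms that B does not span the space of 2 x 2 matrices.
print(in_span(1, 0, 0, 1))
```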


Notice that in Example 5 the set B is linearly independent, but the three matrices
do not span the set of all 2 x 2 matrices. We will see that the minimal number of
matrices required to span $M_{2\times 2}$ is four.

Another vector space we have already considered is $P_n$, the vector space of
polynomials of degree less than or equal to n. The standard basis for $P_n$ is the set

\[
B = \{1, x, x^2, \ldots, x^n\}
\]

Indeed, if $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ is any polynomial in $P_n$, then it is a
linear combination of the vectors in B, so span(B) = $P_n$. To show that B is linearly
independent, suppose that

\[
c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n = 0
\]

for all real numbers x. We can write this equation as

\[
c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n = 0 + 0x + 0x^2 + \cdots + 0x^n
\]

Since two polynomials are identical if and only if the coefficients of like terms are
equal, then $c_0 = c_1 = c_2 = \cdots = c_n = 0$.
Another basis for $P_2$ is given in Example 6.


EXAMPLE 6 Show that B = $\{x + 1, x - 1, x^2\}$ is a basis for $P_2$.

Solution

Let $ax^2 + bx + c$ be an arbitrary polynomial in $P_2$. To verify that B spans $P_2$, we
must show that scalars $c_1$, $c_2$, and $c_3$ can be found such that

\[
c_1(x + 1) + c_2(x - 1) + c_3 x^2 = ax^2 + bx + c
\]

for every choice of a, b, and c. Collecting like terms on the left-hand side gives

\[
c_3 x^2 + (c_1 + c_2)x + (c_1 - c_2) = ax^2 + bx + c
\]

Equating coefficients on both sides, we obtain $c_3 = a$, $c_1 + c_2 = b$ and $c_1 - c_2 = c$.
This linear system has the unique solution

\[
c_1 = \tfrac{1}{2}(b + c) \qquad c_2 = \tfrac{1}{2}(b - c) \qquad c_3 = a
\]

Therefore, span(B) = $P_2$. To show linear independence, we consider the equation

\[
c_1(x + 1) + c_2(x - 1) + c_3 x^2 = 0 + 0x + 0x^2
\]

Observe that the solution above holds for all choices of a, b, and c; taking
a = b = c = 0 gives $c_1 = c_2 = c_3 = 0$. Therefore, the set B is also linearly
independent and hence is a basis.


Another way of showing that the set B of Example 6 is a basis for $P_2$ is to show
that the polynomials of the standard basis can be written as linear combinations of
the polynomials in B. Specifically, we have

\[
1 = \tfrac{1}{2}(x + 1) - \tfrac{1}{2}(x - 1) \qquad x = \tfrac{1}{2}(x + 1) + \tfrac{1}{2}(x - 1)
\]

and $x^2$ is already in B.
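Both routes to the conclusion can be replayed with exact arithmetic. In the Python sketch below a polynomial $ax^2 + bx + c$ is stored as the triple (a, b, c); the function names `coords` and `expand` are illustrative assumptions.

```python
from fractions import Fraction

def coords(a, b, c):
    """Coefficients (c1, c2, c3) of ax^2 + bx + c relative to {x + 1, x - 1, x^2}."""
    return (Fraction(b + c, 2), Fraction(b - c, 2), Fraction(a))

def expand(c1, c2, c3):
    """Expand c1(x + 1) + c2(x - 1) + c3*x^2 into (x^2, x, constant) coefficients."""
    return (c3, c1 + c2, c1 - c2)

# The standard basis polynomials 1 and x, written in terms of B.
print(coords(0, 0, 1))  # coefficients for the constant polynomial 1
print(coords(0, 1, 0))  # coefficients for the polynomial x

# Round trip for an arbitrary polynomial, here 2x^2 - 3x + 5.
print(expand(*coords(2, -3, 5)) == (2, -3, 5))
```

The first two lines reproduce the halves in the displayed identities for 1 and x.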


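Dimension


We have already seen in Theorem 3 of Sec. 2.3 that any set of m vectors from $\mathbb{R}^n$,
with m > n, must necessarily be linearly dependent. Hence, any basis of $\mathbb{R}^n$ contains
at most n vectors. It can also be shown that any linearly independent set of m vectors,
with m < n, does not span $\mathbb{R}^n$. For example, as we have already seen, two linearly
independent vectors in $\mathbb{R}^3$ span a plane. Hence, any basis of $\mathbb{R}^n$ must contain exactly n
vectors. The number n, an invariant of $\mathbb{R}^n$, is called the dimension of $\mathbb{R}^n$. Theorem 11
shows that this holds for abstract vector spaces.


THEOREM 11 If a vector space V has a basis with n vectors, then every basis has n vectors.


Proof Let B = $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a basis for V, and let T = $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$
be a subset of V with m > n. We claim that T is linearly dependent. To establish
this result, observe that since B is a basis, then every vector in T can be written
as a linear combination of the vectors from B. That is,

\[
\begin{aligned}
\mathbf{u}_1 &= \lambda_{11}\mathbf{v}_1 + \lambda_{12}\mathbf{v}_2 + \cdots + \lambda_{1n}\mathbf{v}_n \\
\mathbf{u}_2 &= \lambda_{21}\mathbf{v}_1 + \lambda_{22}\mathbf{v}_2 + \cdots + \lambda_{2n}\mathbf{v}_n \\
&\;\;\vdots \\
\mathbf{u}_m &= \lambda_{m1}\mathbf{v}_1 + \lambda_{m2}\mathbf{v}_2 + \cdots + \lambda_{mn}\mathbf{v}_n
\end{aligned}
\]

Now consider the equation

\[
c_1\mathbf{u}_1 + c_2\mathbf{u}_2 + \cdots + c_m\mathbf{u}_m = \mathbf{0}
\]

Using the equations above, we can write this last equation in terms of the basis
vectors. After collecting like terms, we obtain

\[
\begin{aligned}
&(c_1\lambda_{11} + c_2\lambda_{21} + \cdots + c_m\lambda_{m1})\mathbf{v}_1 \\
&\quad + (c_1\lambda_{12} + c_2\lambda_{22} + \cdots + c_m\lambda_{m2})\mathbf{v}_2 \\
&\quad + \cdots \\
&\quad + (c_1\lambda_{1n} + c_2\lambda_{2n} + \cdots + c_m\lambda_{mn})\mathbf{v}_n = \mathbf{0}
\end{aligned}
\]

Since B is a basis, it is linearly independent, hence

\[
\begin{aligned}
c_1\lambda_{11} + c_2\lambda_{21} + \cdots + c_m\lambda_{m1} &= 0 \\
c_1\lambda_{12} + c_2\lambda_{22} + \cdots + c_m\lambda_{m2} &= 0 \\
&\;\;\vdots \\
c_1\lambda_{1n} + c_2\lambda_{2n} + \cdots + c_m\lambda_{mn} &= 0
\end{aligned}
\]

This last linear system is not square, with n equations in the m variables $c_1, \ldots, c_m$.
Since m > n, by Theorem 3 of Sec. 2.3 the linear system has a nontrivial solution,
and hence T is linearly dependent.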


Now, suppose that T = $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_m\}$ is another basis for the vector space V.
By the result we just established it must be the case that $m \le n$. But by the same
reasoning, the number of vectors in the basis B also cannot exceed the number of
vectors in T, so $n \le m$. Consequently n = m as desired.


We can now give a definition for the dimension of an abstract vector space.


DEFINITION 3 Dimension of a Vector Space The dimension of the vector space V, denoted
by dim(V), is the number of vectors in any basis of V.


For example, since the standard bases for $\mathbb{R}^n$, $M_{2\times 2}$, $M_{m\times n}$, and $P_n$ are

\[
\begin{gathered}
\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\} \\
\{\mathbf{e}_{11}, \mathbf{e}_{12}, \mathbf{e}_{21}, \mathbf{e}_{22}\} \\
\{\mathbf{e}_{ij} \mid 1 \le i \le m,\ 1 \le j \le n\} \\
\{1, x, x^2, \ldots, x^n\}
\end{gathered}
\]

respectively, we have

\[
\dim(\mathbb{R}^n) = n \qquad \dim(M_{2\times 2}) = 4 \qquad \dim(M_{m\times n}) = mn \qquad \dim(P_n) = n + 1
\]

We call a vector space V finite dimensional if there exists a basis for V with
a finite number of vectors. If such a basis does not exist, then V is called infinite
dimensional. The trivial vector space V = {0} is considered finite dimensional, with
dim(V) = 0, even though it does not have a basis. In this text our focus is on finite
dimensional vector spaces, although infinite dimensional vector spaces arise naturally
in many areas of science and mathematics.

To determine whether a set of n vectors from a vector space of dimension n is
or is not a basis, it is sufficient to verify either that the set spans the vector space or
that the set is linearly independent.


THEOREM 12 Suppose that V is a vector space with dim(V) = n.
1. If S = $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is linearly independent, then span(S) = V and S is a
basis.
2. If S = $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ and span(S) = V, then S is linearly independent and
S is a basis.

Proof (1) Suppose that S is linearly independent, and let v be any vector in V. If
v is in S, then v is in span(S). Now suppose that v is not in S. As in the proof of
Theorem 11, the set $\{\mathbf{v}, \mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is linearly dependent. Thus, there are scalars
$c_1, \ldots, c_n, c_{n+1}$, not all zero, such that

\[
c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n + c_{n+1}\mathbf{v} = \mathbf{0}
\]

Observe that $c_{n+1} \neq 0$, since if it were zero, then S would be linearly dependent,
violating the hypothesis that it is linearly independent. Solving for v gives

\[
\mathbf{v} = -\frac{c_1}{c_{n+1}}\mathbf{v}_1 - \frac{c_2}{c_{n+1}}\mathbf{v}_2 - \cdots - \frac{c_n}{c_{n+1}}\mathbf{v}_n
\]

As v was chosen arbitrarily, every vector in V is in span(S) and therefore V =
span(S).

(2) (Proof by contradiction) Assume that S is linearly dependent. Then by
Theorem 5 of Sec. 2.3 one of the vectors in S can be written as a linear combination
of the other vectors. We can eliminate this vector from S without changing the
span. We continue this process until we arrive at a linearly independent spanning
set with fewer than n elements. This contradicts the fact that the dimension of
V is n.


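EXAMPLE 7 Determine whether

\[
B = \left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}
\]

is a basis for $\mathbb{R}^3$.

Solution

Since dim($\mathbb{R}^3$) = 3, the set B is a basis if it is linearly independent. Let

\[
A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}
\]

be the matrix whose column vectors are the vectors of B. The determinant of this
matrix is 1, so that by Theorem 9 of Sec. 2.3 the set B is linearly independent and
hence, by Theorem 12, is a basis. We can also show that B is a basis by showing
that B spans $\mathbb{R}^3$.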
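The determinant test used here is easy to automate for small matrices. The following Python sketch uses cofactor expansion along the first row; the function `det` is written for this illustration and is only practical for small n, since the recursion grows factorially.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

# Matrix whose columns are the vectors of B in Example 7.
A = [[1, 1, 0],
     [0, 1, 0],
     [1, 0, 1]]

d = det(A)
# n vectors in an n-dimensional space form a basis when the determinant is nonzero.
print(d, d != 0)
```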


Finding a Basis

In Sec. 3.2, we saw that the span of a nonempty set of vectors S = $\{\mathbf{v}_1, \ldots, \mathbf{v}_m\}$ is
a subspace. We then ask whether S is a basis for this subspace (or a vector space).
From Theorem 12, this is equivalent to determining whether S is linearly independent.
When the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$ are in $\mathbb{R}^n$, as in Example 7, form the matrix A with ith
column vector equal to $\mathbf{v}_i$. By Theorem 2 of Sec. 1.2, if B is the row echelon matrix
obtained from reducing A, then Ax = 0 if and only if Bx = 0. Now if the column
vectors of A are linearly dependent, then there are scalars $c_1, \ldots, c_m$, not all 0, such
that $c_1\mathbf{v}_1 + \cdots + c_m\mathbf{v}_m = \mathbf{0}$. To express this in matrix form, let

\[
\mathbf{c} = \begin{bmatrix} c_1 \\ \vdots \\ c_m \end{bmatrix}
\]

Then Ac = 0 = Bc. Hence, the column vectors of A and B are either both linearly dependent
or both linearly independent. Observe that the column vectors of B associated with the
pivots are linearly independent since none of the vectors can be a linear combination
of the column vectors that come before. Therefore, by the previous remarks, the
corresponding column vectors of A are also linearly independent. By Theorem 12,
these same column vectors form a basis for col(A). When choosing vectors for a basis
of col(A), we must select the column vectors in A corresponding to the pivot column
vectors of B, and not the pivot column vectors of B. For example, the row-reduced
echelon form of the matrix

\[
A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 1 & 1 \end{bmatrix}
\quad \text{is the matrix} \quad
B = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]

However, the column spaces of A and B are different. In this case col(A) is the xz
plane and col(B) is the xy plane with

\[
\operatorname{col}(A) = \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\} = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}
\]

and

\[
\operatorname{col}(B) = \operatorname{span}\{\mathbf{w}_1, \mathbf{w}_2\} = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} \right\}
\]

respectively. See Fig. 2.

Figure 2: the xz plane and the xy plane, span$\{\mathbf{w}_1, \mathbf{w}_2\}$


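The details of these observations are made clearer by considering a specific
example. Let

\[
S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4, \mathbf{v}_5\} = \left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} \right\}
\]

We begin by considering the equation

\[
c_1 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + c_3 \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} + c_4 \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} + c_5 \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

To solve this system, we reduce the corresponding augmented matrix to reduced
echelon form. That is,

\[
\left[\begin{array}{ccccc|c} 1 & 1 & 2 & 2 & 3 & 0 \\ 1 & 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 2 & 1 & 3 & 0 \end{array}\right]
\quad \text{reduces to} \quad
\left[\begin{array}{ccccc|c} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 0 & 1 & 0 \end{array}\right]
\]

In the general solution, the variables $c_1$, $c_2$, and $c_3$ are the dependent variables
corresponding to the leading ones in the reduced matrix, while $c_4$ and $c_5$ are free. Thus,
the solution is given by

\[
S = \{(-s,\, -s - t,\, -t,\, s,\, t) \mid s, t \in \mathbb{R}\}
\]

Now to find a basis for span(S), we substitute these values into the original vector
equation to obtain

\[
-s \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + (-s - t) \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + (-t) \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} + s \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} + t \begin{bmatrix} 3 \\ 1 \\ 3 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

We claim that each of the vectors corresponding to a free variable is a linear
combination of the others. To establish the claim in this case, let s = 1 and t = 0. The
above vector equation now becomes

\[
-\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

that is,

\[
-\mathbf{v}_1 - \mathbf{v}_2 + \mathbf{v}_4 = \mathbf{0} \quad \text{so that} \quad \mathbf{v}_4 = \mathbf{v}_1 + \mathbf{v}_2
\]

Thus, $\mathbf{v}_4$ is a linear combination of $\mathbf{v}_1$ and $\mathbf{v}_2$. Also, to see that $\mathbf{v}_5$ is a linear
combination of $\mathbf{v}_1$, $\mathbf{v}_2$, and $\mathbf{v}_3$, we let s = 0 and t = 1.
In light of Theorem 8 we eliminate $\mathbf{v}_4$ and $\mathbf{v}_5$ from S to obtain S' = $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$.
Observe that S' is linearly independent since each of these vectors corresponds to a
column with a leading 1. Thus, the equation

\[
c_1 \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + c_2 \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + c_3 \begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\]

has only the trivial solution.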
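The pivot bookkeeping in this discussion can be sketched in Python. The function `pivot_columns` below performs forward elimination with exact fractions and reports which columns receive a pivot; it is an illustrative helper written for this passage, with the five vectors taken from the set S above.

```python
from fractions import Fraction

def pivot_columns(columns):
    """Indices of the pivot columns of the matrix whose columns are `columns`."""
    # Build the matrix row by row from the given column vectors.
    rows = [[Fraction(col[i]) for col in columns] for i in range(len(columns[0]))]
    pivots, r = [], 0
    for c in range(len(columns)):
        # Find a row at or below r with a nonzero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: this column depends on earlier ones
        rows[r], rows[p] = rows[p], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return pivots

S = [(1, 1, 0), (1, 0, 1), (2, 1, 2), (2, 1, 1), (3, 1, 3)]
v1, v2, v3, v4, v5 = S

def add(u, w):
    return tuple(x + y for x, y in zip(u, w))

print(pivot_columns(S))    # 0-based indices of the vectors kept in the basis
print(add(v1, v2) == v4)   # v4 = v1 + v2  (s = 1, t = 0)
print(add(v2, v3) == v5)   # v5 = v2 + v3  (s = 0, t = 1)
```

Note that the indices returned are 0-based, so index 0 corresponds to the first vector of S.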


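We summarize the procedure for finding a basis for the span of a set of vectors.


Given a set S = $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \ldots, \mathbf{v}_n\}$ to find a basis for span(S):
1. Form a matrix A whose column vectors are $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$.
2. Reduce A to row echelon form.
3. The vectors from S that correspond to the columns of the reduced matrix with
a leading 1 are a basis for span(S).


In Example 8 we use the process described above to show how to obtain a basis
from a spanning set.


EXAMPLE 8 Let

\[
S = \left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ -2 \end{bmatrix} \right\}
\]

Find a basis for the span of S.

Solution

Start by constructing the matrix whose column vectors are the vectors in S. We
reduce the matrix

\[
\begin{bmatrix} 1 & 0 & 1 & 1 & -1 \\ 0 & 1 & 1 & 2 & 1 \\ 1 & 1 & 2 & 1 & -2 \end{bmatrix}
\quad \text{to} \quad
\begin{bmatrix} 1 & 0 & 1 & 0 & -2 \\ 0 & 1 & 1 & 0 & -1 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}
\]

Observe that the columns with a leading 1 in the reduced matrix are columns 1, 2,
and 4. Therefore, a basis B for span(S) is given by $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_4\}$, that is,

\[
B = \left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \right\}
\]


A linearly independent set of vectors in a vector space that is not a basis can be
expanded to a basis by using Theorem 13.


THEOREM 13 Suppose that S = $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a linearly independent subset of a vector space
V. If v is a vector in V that is not in span(S), then T = $\{\mathbf{v}, \mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is linearly
independent.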


Proof To show that T is linearly independent, we consider the equation

\[
c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n + c_{n+1}\mathbf{v} = \mathbf{0}
\]

If $c_{n+1} \neq 0$, then we can solve for the vector v in terms of the vectors of S, contrary
to the hypothesis that v is not in the span of S. Thus, $c_{n+1} = 0$ and the starting
equation is equivalent to

\[
c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0}
\]

Since S is linearly independent, then

\[
c_1 = 0 \qquad c_2 = 0 \qquad \ldots \qquad c_n = 0
\]

Hence, T is linearly independent.


An alternative method for expanding a set of vectors in $\mathbb{R}^n$ to a basis is to add all
of the standard basis vectors of $\mathbb{R}^n$ to the set and then trim the resulting set to a basis.
This technique is illustrated in Example 9.


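EXAMPLE 9 Find a basis for $\mathbb{R}^4$ that contains the vectors

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}
\qquad \text{and} \qquad
\mathbf{v}_2 = \begin{bmatrix} -1 \\ 1 \\ -1 \\ 0 \end{bmatrix}
\]

Solution Notice that the set $\{\mathbf{v}_1, \mathbf{v}_2\}$ is linearly independent. However, it cannot span $\mathbb{R}^4$
since dim($\mathbb{R}^4$) = 4. To find a basis, form the set S = $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \mathbf{e}_4\}$. Since
span$\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3, \mathbf{e}_4\}$ = $\mathbb{R}^4$, we know that span(S) = $\mathbb{R}^4$. Now proceed as in
Example 8 by reducing the matrix

\[
\begin{bmatrix} 1 & -1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 1 & -1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\]

to reduced row echelon form

\[
\begin{bmatrix} 1 & 0 & 0 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}
\]

Observe that the pivot columns are 1, 2, 3, and 6. A basis is therefore given by the
set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{e}_1, \mathbf{e}_4\}$.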
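The same basis can be reached by applying Theorem 13 one vector at a time: a standard basis vector is appended only if it lies outside the span of the vectors collected so far, which is detected by an increase in rank. The Python sketch below takes this route rather than the single row reduction used in the solution; the helpers `rank` and `extend_to_basis` are written for this illustration.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors, via forward elimination on exact fractions."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    rk = 0
    for c in range(len(rows[0])):
        p = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[rk], rows[p] = rows[p], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][c] / rows[rk][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk

def extend_to_basis(independent, candidates):
    """Append each candidate that is not in the span of the current set."""
    basis = list(independent)
    for v in candidates:
        # The rank increases exactly when v is outside span(basis).
        if rank(basis + [v]) > len(basis):
            basis.append(v)
    return basis

v1 = (1, 0, 1, 0)
v2 = (-1, 1, -1, 0)
e1, e2, e3, e4 = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

basis = extend_to_basis([v1, v2], [e1, e2, e3, e4])
print(basis == [v1, v2, e1, e4])
```

Processing the candidates in the order $\mathbf{e}_1, \ldots, \mathbf{e}_4$ selects the same vectors as the pivot columns of the matrix in the solution; a different order could produce a different, equally valid basis.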


The following useful corollary results from repeated application of Theorem 13. 


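COROLLARY 1 Let S = $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ be a linearly independent set of vectors in an n-dimensional
vector space V with r < n. Then S can be expanded to a basis for V. That is, there
are vectors $\{\mathbf{v}_{r+1}, \mathbf{v}_{r+2}, \ldots, \mathbf{v}_n\}$ so that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ is a basis
for V.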


Fact Summary


Let V be a vector space with dim(V) = n.

1. There are finite sets of vectors that span V. The set of vectors can be
linearly independent or linearly dependent. A basis is a linearly independent
set of vectors that spans V.

2. Every nontrivial vector space has infinitely many bases.

3. Every basis of V has n elements.

4. The standard basis for $\mathbb{R}^n$ is the n vectors $\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n$.

5. The standard basis for $M_{2\times 2}$ consists of the four matrices

\[
\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \qquad
\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \qquad
\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} \qquad
\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}
\]

6. The standard basis for $P_n$ is $\{1, x, x^2, \ldots, x^n\}$.

7. If $\mathbf{v}_n$ is a linear combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1}$, then

\[
\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1}\} = \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{n-1}, \mathbf{v}_n\}
\]

8. dim($\mathbb{R}^n$) = n, dim($M_{m\times n}$) = mn, dim($P_n$) = n + 1

9. If a set B of n vectors of V is linearly independent, then B is a basis for V.

10. If the span of a set B of n vectors is V, then B is a basis for V.

11. Every linearly independent subset of V can be expanded to a basis for V.

12. If S is a set of vectors in $\mathbb{R}^n$, a basis can always be found for span(S) from
the vectors of S.


Exercise Set 3.3


neee 7 explain why the set S is nok abasis 5. S = {x, x?, x? +2x, x? — x +1} V=P3 
for the vector space V. 10 0 1 
2 0 ek: al 
1. S= 1g) el V=R 
0 0 2 -3 
3 1 = 
[e Sh irse 


\ V=R In Exercises 7—12, show that S is basis for the vector 
space V. 


8 
-3 
0 offi] eet ar 
i aie a eee 
: 1 
4. =I 


-ffa 
= {2, x, x? + 2x? — 1}V = P3 3 




1 0 0 
9, S= —1 |,| —2 |, 2 
1 —3 —2 
V =R 
—1 2 1 
10. S = —1 |,| —1 |, 1 
0 —3 
V =R 
1 0 1 1 
esi SILL) 
0 1 1 0 
—] 2f] 0 1 
V = Mx 


12. S = {x? + 1, x +2, =x? +x} V = P2 
In Exercises 13—18, determine whether S is a basis 
for the vector space V. 


—1 1 1 
13. S= 21/,/ 0 },] 1 V =R 
1 1 1 
2 5 3 
14. S= -2 | 1l | V=R 
1 2 1 
2 p =] 
1 4 2 
eS ep E a e 
1 5 3 
V= Rt 
=i 1 2 
1 1 3 1 
16. S= le ct h oe all. 
1 = | 2 
v=R 
17. S={1, 2x7 +x +2, =x? +x} V = P> 


24 
1 0 
igel 


V = Mx2 


In Exercises 19—24, find a basis for the subspace S of 
the vector space V. Specify the dimension of S. 


s+2t 
19. S= —s +t stERSV=R 
t 


= a at+d 
a sa{[tyd] 


21. Let S be the subspace of V = M2x2 consisting of 
all 2 x 2 symmetric matrices. 


ad eR} V = Mao 


22. Let S be the subspace of V = M2x2 consisting of 
all 2 x 2 skew-symmetric matrices. 


23. S = {p(x) | p(0) = 0} V = P2 


24. S= {p(x) | pO) = 0, pl) = 0} V = Ps 


In Exercises 25–30, find a basis for span(S) as a
subspace of $\mathbb{R}^3$.


2 2 1 
25. S = ON 0o], |2 
—1 a) 1 
=9 4 2 
26. S = to 
3 2 5 
2 0 =i 2 
27. S= =a a l =f. lal 3 
0 2 0 =i 
a9 1 3 1 
28. S = oli oll =3l 2 
2 3 = = 
2 0 2 4 
29. S = ee ne h =i ee 
0 2 2 4 
2 1 0 2 
30. S=} |2 |,| at 2 el Ss 
0 0 2 1 


In Exercises 31—36, find a basis for the vector space 
V that contains the given vectors. 


2 1 
31. S= —-1 ],} 0 V=R 
3 2 


32. 


33. 


34. 


35. 


36. 


37. 


38. 


39. 


—1 1 
S= 1],]/ 1 V =R 
3 1 
1 3 
= =l 1 _ p4 
S= 2 bla V=R 
4 2 
fete 
je TEE oie ee 
= 1 pi -1 P] 1 = 
—1 2 
—1 1 
S= 1|, |1 V =R 
3 1 
2 —1 
S= 2 |,| —1 V =R? 
—1 3 
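37. Find a basis for the subspace of $M_{n\times n}$ consisting
of all diagonal matrices.

38. Show that if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for the
vector space V and c is a nonzero scalar, then
$S' = \{c\mathbf{v}_1, c\mathbf{v}_2, \ldots, c\mathbf{v}_n\}$ is also a basis for V.

39. Show that if $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a basis for $\mathbb{R}^n$
and A is an $n \times n$ invertible matrix, then
$S' = \{A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_n\}$ is also a basis.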


40. Find a basis for the subspace
$S = \{\mathbf{x} \in \mathbb{R}^4 \mid A\mathbf{x} = \mathbf{0}\}$ of $\mathbb{R}^4$ where

\[
A = \begin{bmatrix} 3 & 3 & 1 & 3 \\ 1 & 0 & 1 & 1 \\ 2 & 0 & 2 & 1 \end{bmatrix}
\]

41. Suppose that V is a vector space with
dim(V) = n. Show that if H is a subspace of V
and dim(H) = n, then H = V.

42. Let S and T be the subspaces of $P_3$ defined by

\[
S = \{p(x) \mid p(0) = 0\} \qquad \text{and} \qquad T = \{q(x) \mid q(1) = 0\}
\]

Find dim(S), dim(T), and dim(S ∩ T).

43. Let

\[
W = \left\{ \begin{bmatrix} 2s + t + 3r \\ 3s - t + 2r \\ s + t + 2r \end{bmatrix} \;\middle|\; s, t, r \in \mathbb{R} \right\}
\]

Find dim(W).

44. Let S and T be the subspaces of $\mathbb{R}^4$ defined by


sS 
S= : teR 
= 0 S, 
0 
and 
(0) 
T= $ s,teR 
(0) 


Find dim(S), dim(T), and dim(S N T). 


3.4 > Coordinates and Change of Basis 


From our earliest experiences with Euclidean space we have used rectangular coordinates
(or xy coordinates) to specify the location of a point in the plane. Equivalently,
these coordinates describe a vector in standard position which terminates at the point.
Equipped with our knowledge of linear combinations, we now understand these xy
coordinates to be the scalar multiples required to express the vector as a linear
combination of the standard basis vectors $\mathbf{e}_1$ and $\mathbf{e}_2$. For example, the vector
$\mathbf{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$, with xy coordinates (2, 3), can be written as

\[
\mathbf{v} = 2\mathbf{e}_1 + 3\mathbf{e}_2
\]




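as shown in Fig. 1(a). This point (or vector) can also be specified relative to another
pair of linearly independent vectors, describing an x′y′ coordinate system. For
example, since

\[
\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \frac{5}{2}\begin{bmatrix} 1 \\ 1 \end{bmatrix} + \frac{1}{2}\begin{bmatrix} -1 \\ 1 \end{bmatrix}
\]

the x′y′ coordinates of $\mathbf{v}$ are given by $\left(\frac{5}{2}, \frac{1}{2}\right)$. See Fig. 1(b).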


(a) (b) 
Figure 1 


In this section we generalize this concept to abstract vector spaces. Let V be
a vector space with basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$. From Theorem 7 of Sec. 2.3, every
vector $\mathbf{v}$ in V can be written uniquely as a linear combination of the vectors of B.
That is, there are unique scalars $c_1, c_2, \ldots, c_n$ such that

\[
\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n
\]

It is tempting to associate the list of scalars $\{c_1, c_2, \ldots, c_n\}$ with the coordinates of
the vector $\mathbf{v}$. However, changing the order of the basis vectors in B will change the
order of the scalars. For example, the sets

\[
B = \left\{ \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \end{bmatrix} \right\}
\qquad \text{and} \qquad
B' = \left\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \end{bmatrix} \right\}
\]

are both bases for $\mathbb{R}^2$. Then the list of scalars associated with the vector $\begin{bmatrix} 1 \\ 2 \end{bmatrix}$ is
{1, 2} relative to B but is {2, 1} relative to B′. To remove this ambiguity, we introduce
the notion of an ordered basis.


DEFINITION 1 Ordered Basis An ordered basis of a vector space V is a fixed sequence of
linearly independent vectors that span V.




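DEFINITION 2 Coordinates Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be an ordered basis for the vector space
V. Let $\mathbf{v}$ be a vector in V, and let $c_1, c_2, \ldots, c_n$ be the unique scalars such that

\[
\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n
\]

Then $c_1, c_2, \ldots, c_n$ are called the coordinates of $\mathbf{v}$ relative to B. In this case we
write

\[
[\mathbf{v}]_B = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}
\]

and refer to the vector $[\mathbf{v}]_B$ as the coordinate vector of $\mathbf{v}$ relative to B.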


In $\mathbb{R}^n$ the coordinates of a vector relative to the standard basis $B = \{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$
are simply the components of the vector. Similarly, the coordinates of a polynomial
$p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$ in $P_n$ relative to the standard basis
$\{1, x, x^2, \ldots, x^n\}$ are the coefficients of the polynomial.
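The definition of coordinates can be tried out numerically. The following is a minimal sketch, not part of the text: the helper name `coordinates_2d` is hypothetical, and it assumes a basis of $\mathbb{R}^2$ given as two integer pairs. It solves $c_1\mathbf{b}_1 + c_2\mathbf{b}_2 = \mathbf{v}$ with Cramer's rule and exact fractions, using the vector (2, 3) with the basis {(1, 1), (-1, 1)} from the opening discussion and the reordered standard basis discussed before Definition 1.

```python
from fractions import Fraction

def coordinates_2d(basis, v):
    """Coordinates (c1, c2) of v relative to the ordered basis (b1, b2) of R^2.

    Solves c1*b1 + c2*b2 = v with Cramer's rule, using exact fractions.
    """
    (a, c), (b, d) = basis            # b1 = (a, c), b2 = (b, d)
    x, y = v
    det = a * d - b * c               # determinant of the matrix [b1 b2]
    if det == 0:
        raise ValueError("b1 and b2 are linearly dependent, so they are not a basis")
    c1 = Fraction(x * d - b * y, det)
    c2 = Fraction(a * y - c * x, det)
    return c1, c2

standard = ((1, 0), (0, 1))           # e1, e2
rotated = ((1, 1), (-1, 1))           # the x'y' system of the opening example
swapped = ((0, 1), (1, 0))            # same vectors as `standard`, other order

print(coordinates_2d(standard, (2, 3)))
print(coordinates_2d(rotated, (2, 3)))
print(coordinates_2d(swapped, (1, 2)))
```

Relative to the standard basis the coordinates are just the components; relative to the second basis they come out as 5/2 and 1/2; and swapping the order of the basis vectors swaps the coordinates, which is why the basis has to be ordered.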


| EXAMPLE1 | Let V = R? and B be the ordered basis 
1 —1 
s= (| a | 
Find the coordinates of the vector v = | : | relative to B. 


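Solution The coordinates $c_1$ and $c_2$ are found by writing $\mathbf{v}$ as a linear combination of the
two vectors in B. That is, we solve the equation

\[
c_1\mathbf{b}_1 + c_2\mathbf{b}_2 = \mathbf{v}
\]

where $\mathbf{b}_1$ and $\mathbf{b}_2$ denote the first and second vectors of B. In this case $c_1 = 3$ and
$c_2 = 2$. We therefore have that the coordinate vector of $\mathbf{v}$ relative to B is

\[
[\mathbf{v}]_B = \begin{bmatrix} 3 \\ 2 \end{bmatrix}
\]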


| EXAMPLE 2 | Let $V = P_2$ and B be the ordered basis

\[
B = \{1,\; x - 1,\; (x - 1)^2\}
\]

Find the coordinates of $p(x) = 2x^2 - 2x + 1$ relative to B.

Solution We must find $c_1$, $c_2$, and $c_3$ such that

\[
c_1(1) + c_2(x - 1) + c_3(x - 1)^2 = 2x^2 - 2x + 1
\]

Expanding the left-hand side and collecting like terms give

\[
c_3x^2 + (c_2 - 2c_3)x + (c_1 - c_2 + c_3) = 2x^2 - 2x + 1
\]

Equating the coefficients of like terms gives the linear system

\[
\begin{aligned}
c_1 - c_2 + c_3 &= 1 \\
c_2 - 2c_3 &= -2 \\
c_3 &= 2
\end{aligned}
\]

The unique solution to this system is $c_1 = 1$, $c_2 = 2$, and $c_3 = 2$, so that

\[
[p(x)]_B = \begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
\]
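As a cross-check on Example 2, the coordinates of a polynomial relative to a basis of the form $\{1, x - a, (x - a)^2, \ldots\}$ can also be obtained by repeated synthetic division by $(x - a)$, where each remainder is the next coordinate. This is a different technique from the linear system used in the solution above, and the function names below are hypothetical. A minimal sketch, assuming the polynomial is stored as a coefficient list with the constant term first:

```python
def shifted_coordinates(coeffs, a):
    """Coordinates of p relative to the ordered basis {1, (x-a), (x-a)^2, ...}.

    coeffs = [a0, a1, ..., an] represents p(x) = a0 + a1*x + ... + an*x^n.
    Each pass divides the current quotient by (x - a); the remainder of
    pass k is the coefficient of (x - a)^k.
    """
    work = list(reversed(coeffs))          # highest-degree coefficient first
    coords = []
    while work:
        for i in range(1, len(work)):      # one synthetic-division pass
            work[i] += a * work[i - 1]
        coords.append(work.pop())          # remainder = next coordinate
    return coords

def evaluate_shifted(coords, a, x):
    """Evaluate sum(coords[k] * (x - a)**k) for checking purposes."""
    return sum(c * (x - a) ** k for k, c in enumerate(coords))

# p(x) = 2x^2 - 2x + 1 relative to B = {1, x - 1, (x - 1)^2}
coords = shifted_coordinates([1, -2, 2], 1)
print(coords)
```

The result agrees with the coordinate vector found in Example 2, and `evaluate_shifted` can be used to confirm that the combination reproduces p(x) at any test point.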


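| EXAMPLE 3 | Let W be the subspace of all symmetric matrices in the vector space $M_{2\times 2}$. Let

\[
B = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}
\]

Show that B is a basis for W and find the coordinates of

\[
\mathbf{v} = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}
\]

relative to B.

Solution In Example 8 of Sec. 3.2, we showed that B spans W. The matrices in B are also
linearly independent and hence are a basis for W. Observe that $\mathbf{v}$ can be written as

\[
2\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + 3\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} + 5\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 3 \\ 3 & 5 \end{bmatrix}
\]

Then relative to the ordered basis B, the coordinate vector of $\mathbf{v}$ is

\[
[\mathbf{v}]_B = \begin{bmatrix} 2 \\ 3 \\ 5 \end{bmatrix}
\]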




Change of Basis 


Many problems in applied mathematics are made easier by changing from one basis of 
a vector space to another. To simplify our explanation of the process, we will consider 
a vector space V with dim(V) = 2 and show how to change from coordinates relative 
to one basis for V to another basis for V. 

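Let V be a vector space of dimension 2 and let

\[
B = \{\mathbf{v}_1, \mathbf{v}_2\} \qquad \text{and} \qquad B' = \{\mathbf{v}_1', \mathbf{v}_2'\}
\]

be ordered bases for V. Now let $\mathbf{v}$ be a vector in V, and suppose that the coordinates
of $\mathbf{v}$ relative to B are given by

\[
[\mathbf{v}]_B = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} \qquad \text{that is} \qquad \mathbf{v} = x_1\mathbf{v}_1 + x_2\mathbf{v}_2
\]

To determine the coordinates of $\mathbf{v}$ relative to B′, we first write $\mathbf{v}_1$ and $\mathbf{v}_2$ in terms of
the vectors $\mathbf{v}_1'$ and $\mathbf{v}_2'$. Since B′ is a basis, there are scalars $a_1$, $a_2$, $b_1$, and $b_2$ such
that

\[
\begin{aligned}
\mathbf{v}_1 &= a_1\mathbf{v}_1' + a_2\mathbf{v}_2' \\
\mathbf{v}_2 &= b_1\mathbf{v}_1' + b_2\mathbf{v}_2'
\end{aligned}
\]

Then $\mathbf{v}$ can be written as

\[
\mathbf{v} = x_1(a_1\mathbf{v}_1' + a_2\mathbf{v}_2') + x_2(b_1\mathbf{v}_1' + b_2\mathbf{v}_2')
\]

Collecting the coefficients of $\mathbf{v}_1'$ and $\mathbf{v}_2'$ gives

\[
\mathbf{v} = (x_1a_1 + x_2b_1)\mathbf{v}_1' + (x_1a_2 + x_2b_2)\mathbf{v}_2'
\]

so that the coordinates of $\mathbf{v}$ relative to the basis B′ are given by

\[
[\mathbf{v}]_{B'} = \begin{bmatrix} x_1a_1 + x_2b_1 \\ x_1a_2 + x_2b_2 \end{bmatrix}
\]

Now by rewriting the vector on the right-hand side as a matrix product, we have

\[
[\mathbf{v}]_{B'} = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}[\mathbf{v}]_B
\]

Notice that the column vectors of the matrix are the coordinate vectors $[\mathbf{v}_1]_{B'}$ and
$[\mathbf{v}_2]_{B'}$. The matrix

\[
\begin{bmatrix} a_1 & b_1 \\ a_2 & b_2 \end{bmatrix}
\]

is called the transition matrix from B to B′ and is denoted by $[I]_B^{B'}$, so that

\[
[\mathbf{v}]_{B'} = [I]_B^{B'}[\mathbf{v}]_B
\]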




Let V = R? with bases 


{i}[4]) = 


a. Find the transition matrix from B to B’. 


b. Let [v] = | E | and find [v],’. 


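Solution a. By denoting the vectors in B by $\mathbf{v}_1$ and $\mathbf{v}_2$ and those in B′ by $\mathbf{v}_1'$ and $\mathbf{v}_2'$, the
column vectors of the transition matrix are $[\mathbf{v}_1]_{B'}$ and $[\mathbf{v}_2]_{B'}$. These coordinate
vectors are found from the equations

\[
c_1\mathbf{v}_1' + c_2\mathbf{v}_2' = \mathbf{v}_1 \qquad \text{and} \qquad d_1\mathbf{v}_1' + d_2\mathbf{v}_2' = \mathbf{v}_2
\]

Solving these equations gives $c_1 = 2$ and $c_2 = 3$, and $d_1 = 0$ and $d_2 = -1$, so
that

\[
[\mathbf{v}_1]_{B'} = \begin{bmatrix} 2 \\ 3 \end{bmatrix} \qquad \text{and} \qquad [\mathbf{v}_2]_{B'} = \begin{bmatrix} 0 \\ -1 \end{bmatrix}
\]

Therefore, the transition matrix is

\[
[I]_B^{B'} = \begin{bmatrix} 2 & 0 \\ 3 & -1 \end{bmatrix}
\]

b. Since

\[
[\mathbf{v}]_{B'} = [I]_B^{B'}[\mathbf{v}]_B
\]

then

\[
[\mathbf{v}]_{B'} = \begin{bmatrix} 2 & 0 \\ 3 & -1 \end{bmatrix}[\mathbf{v}]_B
\]

Observe that the same vector, relative to the different bases, is obtained from
the coordinates $[\mathbf{v}]_B$ and $[\mathbf{v}]_{B'}$.

The procedure to find the transition matrix between two bases of a vector space
of dimension 2 can be generalized to $\mathbb{R}^n$ and other vector spaces of finite dimension.
The result is stated in Theorem 14.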


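THEOREM 14 Let V be a vector space of dimension n with ordered bases

\[
B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\} \qquad \text{and} \qquad B' = \{\mathbf{v}_1', \mathbf{v}_2', \ldots, \mathbf{v}_n'\}
\]

Then the transition matrix from B to B′ is given by

\[
[I]_B^{B'} = \begin{bmatrix} [\mathbf{v}_1]_{B'} & [\mathbf{v}_2]_{B'} & \cdots & [\mathbf{v}_n]_{B'} \end{bmatrix}
\]

Moreover, a change of coordinates is carried out by

\[
[\mathbf{v}]_{B'} = [I]_B^{B'}[\mathbf{v}]_B
\]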
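Theorem 14 translates directly into a computation in $\mathbb{R}^n$: each column of the transition matrix is the solution of one linear system. The sketch below is illustrative only; the function names are hypothetical, and Gauss-Jordan elimination with exact fractions is one of several ways to solve the systems. It is exercised on the standard basis of $\mathbb{R}^2$ and the basis {(-1, 1), (1, 1)}, with the vector (3, 4).

```python
from fractions import Fraction

def solve(columns, rhs):
    """Solve x1*columns[0] + ... + xn*columns[n-1] = rhs by Gauss-Jordan elimination."""
    n = len(rhs)
    # Augmented matrix: row i holds the i-th component of every column vector.
    aug = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(rhs[i])]
           for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors do not form a basis")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [entry / p for entry in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def transition_matrix(old_basis, new_basis):
    """Matrix whose j-th column is the coordinate vector of old_basis[j] relative to new_basis."""
    cols = [solve(new_basis, v) for v in old_basis]
    n = len(cols)
    return [[cols[j][i] for j in range(n)] for i in range(n)]

def change_coordinates(matrix, coords):
    """Matrix-vector product: coordinates relative to the new basis."""
    return [sum(a * x for a, x in zip(row, coords)) for row in matrix]

B = [(1, 0), (0, 1)]
B_prime = [(-1, 1), (1, 1)]
T = transition_matrix(B, B_prime)
print(T)
print(change_coordinates(T, [3, 4]))
```

Because B is the standard basis here, the coordinates of (3, 4) relative to B are its components, so the last line gives the coordinates of the same vector relative to B′.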


In Example 5 we use the result of Theorem 14 to change from one basis of $P_2$
to another.

| EXAMPLE 5 | Let $V = P_2$ with ordered bases

\[
B = \{1, x, x^2\} \qquad \text{and} \qquad B' = \{1,\; x + 1,\; x^2 + x + 1\}
\]

a. Find the transition matrix $[I]_B^{B'}$.
b. Let $p(x) = 3 - x + 2x^2$ and find $[p(x)]_{B'}$.

Solution a. To find the first column vector of the transition matrix, we must find scalars
$a_1$, $a_2$, and $a_3$ such that

\[
a_1(1) + a_2(x + 1) + a_3(x^2 + x + 1) = 1
\]

By inspection we see that the solution is $a_1 = 1$, $a_2 = 0$, and $a_3 = 0$. Therefore,

\[
[1]_{B'} = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}
\]

The second and third column vectors of the transition matrix can be found by
solving the equations

\[
b_1(1) + b_2(x + 1) + b_3(x^2 + x + 1) = x
\]

and

\[
c_1(1) + c_2(x + 1) + c_3(x^2 + x + 1) = x^2
\]

respectively. The solutions are given by $b_1 = -1$, $b_2 = 1$, and $b_3 = 0$, and
$c_1 = 0$, $c_2 = -1$, and $c_3 = 1$. Hence, the transition matrix is

\[
[I]_B^{B'} = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}
\]

b. The basis B is the standard basis for $P_2$, so the coordinate vector of
$p(x) = 3 - x + 2x^2$ relative to B is given by

\[
[p(x)]_B = \begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix}
\]

Hence,

\[
[p(x)]_{B'} = \begin{bmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 3 \\ -1 \\ 2 \end{bmatrix} = \begin{bmatrix} 4 \\ -3 \\ 2 \end{bmatrix}
\]

Notice that $3 - x + 2x^2 = 4(1) - 3(x + 1) + 2(x^2 + x + 1)$.
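The closing identity of Example 5 can be checked mechanically by expanding linear combinations of the basis polynomials. The sketch below is only a verification aid under stated assumptions: polynomials are stored as coefficient lists with the constant term first, and the helper name `combine` is hypothetical.

```python
def combine(polys, coords):
    """Coefficient list of coords[0]*polys[0] + coords[1]*polys[1] + ..."""
    total = [0] * max(len(p) for p in polys)
    for c, p in zip(coords, polys):
        for i, a in enumerate(p):
            total[i] += c * a
    return total

# B' = {1, x + 1, x^2 + x + 1}, stored as [constant, x, x^2]
B_prime = [[1], [1, 1], [1, 1, 1]]

# Transition matrix from B = {1, x, x^2} to B', as found in Example 5
T = [[1, -1, 0],
     [0, 1, -1],
     [0, 0, 1]]

p_B = [3, -1, 2]                                   # p(x) = 3 - x + 2x^2
p_B_prime = [sum(t * c for t, c in zip(row, p_B)) for row in T]

print(p_B_prime)                                   # coordinates relative to B'
print(combine(B_prime, p_B_prime))                 # expands back to p(x)

# Each column of T, used as B'-coordinates, must expand to 1, x, x^2 in turn.
columns = [[row[j] for row in T] for j in range(3)]
print([combine(B_prime, col) for col in columns])
```

Expanding the B′-coordinates returns the original coefficients of p(x), and the three columns expand to the standard basis polynomials, which is exactly the defining property of the transition matrix.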


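| EXAMPLE 6 | Let $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ be the standard ordered basis for $\mathbb{R}^2$, B′ be the ordered basis
given by

\[
B' = \{\mathbf{v}_1', \mathbf{v}_2'\} = \left\{ \begin{bmatrix} -1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \end{bmatrix} \right\}
\]

and let $\mathbf{v} = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$.

a. Find the transition matrix from B to B′.
b. Find $[\mathbf{v}]_{B'}$.
c. Write the vector $\mathbf{v}$ as a linear combination of $\mathbf{e}_1$ and $\mathbf{e}_2$ and also as a linear
combination of $\mathbf{v}_1'$ and $\mathbf{v}_2'$.
d. Show the results of part (c) graphically.

Solution a. The transition matrix from B to B′ is computed by solving the equations

\[
c_1\begin{bmatrix} -1 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\qquad \text{and} \qquad
d_1\begin{bmatrix} -1 \\ 1 \end{bmatrix} + d_2\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}
\]

That is, we must solve the linear systems

\[
\begin{aligned} -c_1 + c_2 &= 1 \\ c_1 + c_2 &= 0 \end{aligned}
\qquad \text{and} \qquad
\begin{aligned} -d_1 + d_2 &= 0 \\ d_1 + d_2 &= 1 \end{aligned}
\]

The unique solutions are given by $c_1 = -\frac{1}{2}$, $c_2 = \frac{1}{2}$ and $d_1 = \frac{1}{2}$, $d_2 = \frac{1}{2}$. The
transition matrix is then given by

\[
[I]_B^{B'} = \begin{bmatrix} -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix}
\]

b. Since B is the standard basis, the coordinates of $\mathbf{v}$ relative to B are
$[\mathbf{v}]_B = \begin{bmatrix} 3 \\ 4 \end{bmatrix}$. By Theorem 14, the coordinates of $\mathbf{v}$ relative to B′ are given by

\[
[\mathbf{v}]_{B'} = \begin{bmatrix} -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix}\begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ \frac{7}{2} \end{bmatrix}
\]

c. Using the coordinates of $\mathbf{v}$ relative to the two bases, we have

\[
3\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 4\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \mathbf{v} = \frac{1}{2}\mathbf{v}_1' + \frac{7}{2}\mathbf{v}_2'
\]

d. The picture given in Fig. 2 shows the location of the terminal point (3, 4) of
the vector $\mathbf{v}$ relative to the $\mathbf{e}_1\mathbf{e}_2$ axes and the $\mathbf{v}_1'\mathbf{v}_2'$ axes.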


Figure 2 


Inverse of a Transition Matrix 


A fact that will be useful in Chap. 4 is that the transition matrix $[I]_B^{B'}$ between bases
B and B′ of a finite dimensional vector space is invertible. Moreover, the transition
matrix from B′ to B is the inverse of $[I]_B^{B'}$. To see this, suppose that V is a vector
space of dimension n with ordered bases

\[
B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\} \qquad \text{and} \qquad B' = \{\mathbf{v}_1', \mathbf{v}_2', \ldots, \mathbf{v}_n'\}
\]

To show that $[I]_B^{B'}$ is invertible, let $\mathbf{x} \in \mathbb{R}^n$ be such that

\[
[I]_B^{B'}\mathbf{x} = \mathbf{0}
\]

Observe that the left-hand side of this equation in vector form is
$x_1[\mathbf{v}_1]_{B'} + \cdots + x_n[\mathbf{v}_n]_{B'}$. Since B is a basis, the vectors $\mathbf{v}_i$ for $1 \le i \le n$ are linearly independent.
Hence, so are the vectors $[\mathbf{v}_1]_{B'}, \ldots, [\mathbf{v}_n]_{B'}$. Therefore, $x_1 = x_2 = \cdots = x_n = 0$. Since
the only solution to the homogeneous equation $[I]_B^{B'}\mathbf{x} = \mathbf{0}$ is the trivial solution, then
by Theorem 17 of Sec. 1.6, the matrix $[I]_B^{B'}$ is invertible. Moreover, by Theorem 14,
since

\[
[\mathbf{v}]_{B'} = [I]_B^{B'}[\mathbf{v}]_B \qquad \text{we know that} \qquad \left([I]_B^{B'}\right)^{-1}[\mathbf{v}]_{B'} = [\mathbf{v}]_B
\]

and therefore

\[
[I]_{B'}^{B} = \left([I]_B^{B'}\right)^{-1}
\]

The previous observations are summarized in Theorem 15.


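THEOREM 15 Let V be a vector space of dimension n with ordered bases

\[
B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\} \qquad \text{and} \qquad B' = \{\mathbf{v}_1', \mathbf{v}_2', \ldots, \mathbf{v}_n'\}
\]

Then the transition matrix $[I]_B^{B'}$ from B to B′ is invertible and

\[
[I]_{B'}^{B} = \left([I]_B^{B'}\right)^{-1}
\]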
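Theorem 15 can be illustrated with the two bases of Example 6. Since B is the standard basis of $\mathbb{R}^2$, the transition matrix from B′ back to B simply has the vectors of B′ as its columns, so the theorem predicts that this matrix is the inverse of $[I]_B^{B'}$. The sketch below checks that prediction; the helper names are hypothetical and the 2 × 2 inverse formula is used only because the example is two-dimensional.

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[d / det, -b / det], [-c / det, a / det]]

def matmul_2x2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

half = Fraction(1, 2)
B_to_Bprime = [[-half, half], [half, half]]     # transition matrix from Example 6
Bprime_to_B = [[Fraction(-1), Fraction(1)],     # columns are v1' = (-1, 1) and
               [Fraction(1), Fraction(1)]]      # v2' = (1, 1) in standard coordinates

print(inverse_2x2(B_to_Bprime))
print(matmul_2x2(B_to_Bprime, Bprime_to_B))
```

The computed inverse coincides with the matrix built from the B′ vectors, and the product of the two transition matrices is the identity.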


Fact Summary 


Let V be a vector space with dim(V) = n. 


1. In $\mathbb{R}^n$, the coordinates of a vector with respect to the standard basis are the
components of the vector.

2. Given any two ordered bases for V, a transition matrix can be used to
change the coordinates of a vector relative to one basis to the coordinates
relative to the other basis.

3. If B and B′ are two ordered bases for V, the transition matrix from B to B′
is the matrix $[I]_B^{B'}$ whose column vectors are the coordinates of the basis
vectors of B relative to the basis B′. Also

\[
[\mathbf{v}]_{B'} = [I]_B^{B'}[\mathbf{v}]_B
\]

4. If B and B′ are two ordered bases for V, the transition matrix from B to B′
is invertible and the inverse matrix is the transition matrix from B′ to B.
That is, $\left([I]_B^{B'}\right)^{-1} = [I]_{B'}^{B}$.


Exercise Set 3.4


In Exercises 1—8, find the coordinates of the vector v 


relative t 


1. B= 


o the ordered basis B. v=|—-1 


Care 
| 


——s 
reo 

| 
BRN 

| 
| ens | 

| 
ee 

E=] 

n 
< 
ll 

| 
-= N 

E= 

A 

les) 

ll 
me NN 
N ee 
= oO 




5. B = {1,x — 1, x?} v= p(x) = —2x7 +2x + 3 1 0 1 -l 
ee Gam | eto al 3 ee 
6. B ={x? +2x4+2,2x4+3,-x7+x4+1} 
v= p(x) = —3x? + 6x + 8 0 1 2 0 
1 0f] -1 1 


~5={10 opli of ne {| 


8. B= Ds 0 1l In Exercises 13—18, find the transition matrix between 
0 0 2? the ordered bases B, and Bp; then given [v],,, find 
[i oplos] a 
1 0 0 3 13. B, ={ | j | — \ 
a2 | 2 -2 | 
l 3 B2 = o | in =| 5 | 
In Exercises 9—12, find the coordinates of the vector v 


relative to the two ordered bases B; and B2. 4 


(HE) 
| 
maf 
| 


11. By = {x° —x +1,x7+%4+1,2x7} 
By = {2x7 + 1, —x? +x +2, x +3} 


n=] 
v= p(x) =x74+x43 




17. 


18. 


19. 


20. 




2 
[vla = | 1 
1 


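$B_1 = \{1, x, x^2\}$   $B_2 = \{x^2, 1, x\}$

\[
[\mathbf{v}]_{B_1} = \begin{bmatrix} 2 \\ 3 \\ 5 \end{bmatrix}
\]

$B_1 = \{x^2 - 1,\; 2x^2 + x + 1,\; -x + 1\}$
$B_2 = \{(x - 1)^2,\; x + 2,\; (x + 1)^2\}$

\[
[\mathbf{v}]_{B_1} = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}
\]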

Let B = {v1, V2, v3} be the ordered basis of R? 
consisting of 


— 1 
vV = vV = 0 
1 1 
—1 
V3 = 1 
0 
Find the coordinates of the vector 
a 
Y= b 
Cc 


relative to the ordered basis B. 


Let B = {v1, V2, V3, V4} be the basis of R* 
consisting of 


1 0 
Y= p V = Sl 
1 1 
0 —1 
-l 
V3 = e V4 = , 
—1 0 
0 —1 


21. 


22. 


Find the coordinates of the vector 


relative to the ordered basis B. 


Let 
1 0 0 
B; = 0J, | 1 4, 
0 0 1 


be the standard ordered basis for R? and let 


0 1 0 
B = |.) 0-1, 
0 0 1 


be a second ordered basis. 


a. Find the transition matrix from the ordered 
basis B, to the ordered basis B2. 


b. Find the coordinates of the vector 


Let 


and 


be two ordered bases for R?. 
a. Find [1] 5° 


b. Find [7]! 


-1 
c. Show that ([/Js) = U15! 


23. 


24. 


Let 
1 0 
s={fo}[1]} 

be the standard ordered basis for R? and let 

1 

1 =a 

2 
be a second ordered basis. 
a. Find [/]% 
b. Find the coordinates of 


Bint 


relative to the ordered basis B. 


‘| 


c. Draw the rectangle in the plane with vertices 
(1, 2), (1, 4), (4, 2), and (4, 4). 


d. Draw the polygon in the plane with vertices 
given by the coordinates found in part (b). 


Fix a real number 0 and define the transition 


matrix from the standard ordered basis S$ on R? to 


25. 


3.5 Application: Differential Equations 185 


a second ordered basis B by 


cos 8 


sin 0 cos 0 


u? = | Pe 


a. If [v]s = | ; } then find [v],. 


b. Draw the rectangle in the plane with vertices 


0 0 1 1 

0 1 0 1 
c. Let 0 = 4. Draw the rectangle in the plane 
with vertices the coordinates of the vectors, 


given in part (b), relative to the ordered 
basis B. 


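Suppose that $B_1 = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\}$ and
$B_2 = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ are ordered bases for a vector
space V such that $\mathbf{u}_1 = -\mathbf{v}_1 + 2\mathbf{v}_2$,
$\mathbf{u}_2 = -\mathbf{v}_1 + 2\mathbf{v}_2 - \mathbf{v}_3$, and $\mathbf{u}_3 = -\mathbf{v}_2 + \mathbf{v}_3$.

a. Find the transition matrix $[I]_{B_1}^{B_2}$.

b. Find $[2\mathbf{u}_1 - 3\mathbf{u}_2 + \mathbf{u}_3]_{B_2}$.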


3.5 > Application: Differential Equations 


Differential equations arise naturally in virtually every branch of science and tech- 
nology. They are used extensively by scientists and engineers to solve problems 
concerning growth, motion, vibrations, forces, or any problem involving the rates 
of change of variable quantities. Not surprisingly, mathematicians have devoted a 
great deal of effort to developing methods for solving differential equations. As it 
turns out, linear algebra is highly useful to these efforts. However, linear algebra also 
makes it possible to attain a deeper understanding of the theoretical foundations of 
these equations and their solutions. In this section and in Sec. 5.3 we give a brief 
introduction to the connection between linear algebra and differential equations. 

As a first step, let y be a function of a single variable x. An equation that
involves $x, y, y', y'', \ldots, y^{(n)}$, where n is a fixed positive integer, is called an ordinary
differential equation of order n. We will henceforth drop the qualifier ordinary since
none of the equations we investigate will involve partial derivatives. Also, for obvious
reasons we will narrow the scope of our discussion and consider only equations of a
certain type.




The Exponential Model 


One of the simplest kinds of differential equations is the first-order equation given by

\[
y' = ky
\]

where k is a real number. This equation is used to model quantities that exhibit
exponential growth or decay and is based on the assumption that the rate of change
of the quantity present at any time t is directly proportional to the quantity present at
time t. A solution to a differential equation is a function y = f(t) that satisfies the
equation, that is, results in an identity when substituted for y in the original equation.
To solve this equation, we write it as

\[
\frac{y'}{y} = k
\]

and integrate both sides of the equation with respect to the independent variable to
obtain

\[
\ln y = \int \frac{y'}{y}\,dt = \int k\,dt = kt + A
\]

Solving for y gives

\[
y = e^{\ln y} = e^{kt + A} = e^{A}e^{kt} = Ce^{kt}
\]

where C is an arbitrary constant.

As an illustration, consider the differential equation $y' = 3y$. Then any function of
the form $y(t) = Ce^{3t}$ is a solution. Since the parameter C in the solution is arbitrary,
the solution produces a family of functions all of which satisfy the differential equation.
For this reason $y(t) = Ce^{3t}$ is called the general solution to $y' = 3y$.

In certain cases a physical constraint imposes a condition on the solution that
allows for the identification of a particular solution. If, for example, in the previous
problem it is required that y = 2 when t = 0, then $2 = Ce^{3(0)}$, so that C = 2. Such a
requirement is called an initial condition. A differential equation together with an initial condition
is called an initial-value problem. The solution to the previous initial-value problem
is given by

\[
y(t) = 2e^{3t}
\]

From a linear algebra perspective we can think of the general solution to the
differential equation $y' = ky$ as the span, over $\mathbb{R}$, of the vector $e^{kt}$, which describes a
one-dimensional subspace of the vector space of differentiable functions on the real
line.
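The claim that $y(t) = Ce^{kt}$ solves $y' = ky$, and that an initial condition pins down C, can be checked numerically. The sketch below is an illustration only: the function names are hypothetical, and the derivative is approximated with a central difference, so the comparison holds up to a small numerical error rather than exactly.

```python
import math

def general_solution(k, C):
    """Return the function y(t) = C * e^(k t)."""
    return lambda t: C * math.exp(k * t)

def initial_value_solution(k, t0, y0):
    """Member of the family C * e^(k t) that satisfies y(t0) = y0."""
    C = y0 / math.exp(k * t0)
    return general_solution(k, C)

def derivative(f, t, h=1e-6):
    """Central-difference approximation of f'(t)."""
    return (f(t + h) - f(t - h)) / (2 * h)

# The initial-value problem y' = 3y, y(0) = 2 discussed above
y = initial_value_solution(3, 0, 2)
for t in (0.0, 0.5, 1.0):
    print(t, y(t), derivative(y, t), 3 * y(t))
```

In each printed row the approximate derivative and 3y(t) agree to several digits, and y(0) equals 2, as the initial condition requires.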


Second-Order Differential Equations with Constant 
Coefficients 


We now extend the differential equation of the previous subsection to second-order
and consider equations of the form

\[
y'' + ay' + by = 0
\]

Motivated by the solution to the exponential model, we check to see if there are any
solutions of the form $y = e^{rx}$, for some real number r. After computing the first and
second derivatives $y' = re^{rx}$ and $y'' = r^2e^{rx}$, we see that $y = e^{rx}$ is a solution of the
second-order equation if and only if

\[
r^2e^{rx} + are^{rx} + be^{rx} = 0
\]

that is,

\[
e^{rx}(r^2 + ar + b) = 0
\]

Since $e^{rx} > 0$ for every choice of r and x, we know $e^{rx}$ is a solution of
$y'' + ay' + by = 0$ if and only if

\[
r^2 + ar + b = 0
\]

This equation is called the auxiliary equation. As this equation is quadratic there
are three possibilities for the roots $r_1$ and $r_2$. This in turn yields three possible
variations for the solution of the differential equation. The auxiliary equation can have
two distinct real roots, one real root, or two distinct complex roots. These cases are
considered in order.


Case 1 The roots $r_1$ and $r_2$ are real and distinct.
In this case there are two solutions, given by

\[
y_1(x) = e^{r_1x} \qquad \text{and} \qquad y_2(x) = e^{r_2x}
\]

EXAMPLE 1 Find two distinct solutions to the differential equation $y'' - 3y' + 2y = 0$.

Solution Let $y = e^{rx}$. Since the auxiliary equation $r^2 - 3r + 2 = (r - 1)(r - 2) = 0$ has
the distinct real roots $r_1 = 1$ and $r_2 = 2$, two distinct solutions for the differential
equation are

\[
y_1(x) = e^{x} \qquad \text{and} \qquad y_2(x) = e^{2x}
\]


Case 2 There is one repeated root r. Although the auxiliary equation has only one
root, there are still two distinct solutions, given by

\[
y_1(x) = e^{rx} \qquad \text{and} \qquad y_2(x) = xe^{rx}
\]

EXAMPLE 2 Find two distinct solutions to the differential equation $y'' - 2y' + y = 0$.

Solution Let $y = e^{rx}$. Since the auxiliary equation $r^2 - 2r + 1 = (r - 1)^2 = 0$ has the
repeated root r = 1, two distinct solutions of the differential equation are

\[
y_1(x) = e^{x} \qquad \text{and} \qquad y_2(x) = xe^{x}
\]


Case 3 The auxiliary equation has distinct complex (conjugate) roots given by
$r_1 = \alpha + \beta i$ and $r_2 = \alpha - \beta i$. In this case the solutions are

\[
y_1(x) = e^{\alpha x}\cos\beta x \qquad \text{and} \qquad y_2(x) = e^{\alpha x}\sin\beta x
\]

EXAMPLE 3 Find two distinct solutions to the differential equation $y'' - 2y' + 5y = 0$.

Solution Let $y = e^{rx}$, so the auxiliary equation corresponding to $y'' - 2y' + 5y = 0$ is given
by $r^2 - 2r + 5 = 0$. Applying the quadratic formula gives the complex roots
$r_1 = 1 + 2i$ and $r_2 = 1 - 2i$. The two solutions to the differential equation are then given
by

\[
y_1(x) = e^{x}\cos 2x \qquad \text{and} \qquad y_2(x) = e^{x}\sin 2x
\]
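The three cases can be collected into a single routine that inspects the discriminant of the auxiliary equation $r^2 + ar + b = 0$. The following is a minimal sketch and not the text's own procedure: the function name is hypothetical, and the exact test `disc == 0` is only reliable when a and b are integers or otherwise exactly representable.

```python
import math

def fundamental_solutions(a, b):
    """For y'' + a*y' + b*y = 0 return (case, y1, y2) following Cases 1-3.

    The case is decided by the discriminant of r^2 + a*r + b = 0.
    """
    disc = a * a - 4 * b
    if disc > 0:                                   # Case 1: distinct real roots
        r1 = (-a - math.sqrt(disc)) / 2
        r2 = (-a + math.sqrt(disc)) / 2
        return 1, (lambda x: math.exp(r1 * x)), (lambda x: math.exp(r2 * x))
    if disc == 0:                                  # Case 2: one repeated root
        r = -a / 2
        return 2, (lambda x: math.exp(r * x)), (lambda x: x * math.exp(r * x))
    alpha = -a / 2                                 # Case 3: roots alpha +/- beta*i
    beta = math.sqrt(-disc) / 2
    return (3,
            (lambda x: math.exp(alpha * x) * math.cos(beta * x)),
            (lambda x: math.exp(alpha * x) * math.sin(beta * x)))

# The three examples above: y'' - 3y' + 2y = 0, y'' - 2y' + y = 0, y'' - 2y' + 5y = 0
for a, b in ((-3, 2), (-2, 1), (-2, 5)):
    case, y1, y2 = fundamental_solutions(a, b)
    print(case, y1(1.0), y2(1.0))
```

For the three examples the routine selects Cases 1, 2, and 3 respectively, and the returned functions match the pairs of solutions found by hand.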


In what follows we require Theorem 16 on existence and uniqueness for second-order
linear differential equations. A proof can be found in any text on ordinary
differential equations.

THEOREM 16 Let p(x), q(x), and f(x) be continuous functions on the interval I. If $x_0$ is in I,
then the initial-value problem

\[
y'' + p(x)y' + q(x)y = f(x) \qquad y(x_0) = y_0 \qquad y'(x_0) = y_0'
\]

has a unique solution on I.


Fundamental Sets of Solutions 


With solutions in hand for each one of these cases, we now consider the question
as to whether there are other solutions to equations of this type, and if so, how they
can be described. The simple (but elegant) answer, to which the remainder of this
section is devoted, is found by using linear algebra. We will see that in each case
the functions $y_1(x)$ and $y_2(x)$ form a basis for the vector space of solutions to the
equation $y'' + ay' + by = 0$. Accordingly, every solution y(x) to this equation can be
written as a linear combination $y(x) = c_1y_1(x) + c_2y_2(x)$.

Toward this end, for an integer $n \ge 0$, let $V = C^{(n)}(I)$ be the vector space
of all functions that are n times differentiable on the real interval I. If n = 0, then
$C^{(0)}(I)$ denotes the set of all continuous functions on I. We first show that the solution
set to the differential equation $y'' + ay' + by = 0$ is a subspace of $V = C^{(2)}(I)$.

THEOREM 17 Superposition Principle Suppose that $y_1(x)$ and $y_2(x)$ are functions in
$C^{(2)}(I)$. If $y_1(x)$ and $y_2(x)$ are solutions to the differential equation
$y'' + ay' + by = 0$ and c is any scalar, then $y_1(x) + cy_2(x)$ is also a solution.


Proof Since $y_1(x)$ and $y_2(x)$ are both solutions, then

\[
y_1''(x) + ay_1'(x) + by_1(x) = 0 \qquad \text{and} \qquad y_2''(x) + ay_2'(x) + by_2(x) = 0
\]

Now to show that $y(x) = y_1(x) + cy_2(x)$ is a solution to the differential equation,
observe that

\[
y'(x) = y_1'(x) + cy_2'(x) \qquad \text{and} \qquad y''(x) = y_1''(x) + cy_2''(x)
\]

Substituting the values for y, y′, and y″ in the differential equation and rearranging
the terms gives

\[
\begin{aligned}
& y_1''(x) + cy_2''(x) + a[y_1'(x) + cy_2'(x)] + b[y_1(x) + cy_2(x)] \\
&\quad = y_1''(x) + cy_2''(x) + ay_1'(x) + acy_2'(x) + by_1(x) + bcy_2(x) \\
&\quad = [y_1''(x) + ay_1'(x) + by_1(x)] + c[y_2''(x) + ay_2'(x) + by_2(x)] \\
&\quad = 0 + 0 = 0
\end{aligned}
\]


Let S be the set of solutions to the differential equation $y'' + ay' + by = 0$. By
the superposition principle above and by Theorem 4 of Sec. 3.2, we know that S is a
subspace of $C^{(2)}(I)$.

To analyze the algebraic structure of S, we recall from Exercise 31 of Sec. 2.3
that a set of functions $U = \{f_1(x), f_2(x), \ldots, f_n(x)\}$ is linearly independent on an
interval I if and only if

\[
c_1f_1(x) + c_2f_2(x) + \cdots + c_nf_n(x) = 0
\]

for all $x \in I$ implies that $c_1 = c_2 = \cdots = c_n = 0$. Theorem 18 provides a useful test
to decide whether two functions are linearly independent on an interval.


THEOREM 18 Wronskian Let f(x) and g(x) be differentiable functions on an interval I.
Define the function W[f, g] on I by

\[
W[f, g](x) = \begin{vmatrix} f(x) & g(x) \\ f'(x) & g'(x) \end{vmatrix} = f(x)g'(x) - f'(x)g(x)
\]

If $W[f, g](x_0)$ is nonzero for some $x_0$ in I, then f(x) and g(x) are linearly
independent on I.


Proof Consider the equation

\[
c_1f(x) + c_2g(x) = 0
\]

Taking derivatives of both sides, we obtain

\[
c_1f'(x) + c_2g'(x) = 0
\]

Taken together, these equations form a linear system of two equations in the two
variables $c_1$ and $c_2$. Observe that the determinant of the corresponding coefficient
matrix is $W[f, g](x)$. Hence, if $W[f, g](x_0)$ is nonzero for some $x_0 \in I$, then by
Theorem 17 of Sec. 1.6 we know that $c_1 = c_2 = 0$. Accordingly, f(x) and g(x)
are linearly independent.


The function W[f, g] of Theorem 18 is called the Wronskian of f and g. The
Wronskian, and the result of Theorem 18, can be extended to any finite set of functions
that have continuous derivatives up to order n.

If $y_1$ and $y_2$ are solutions to the differential equation $y'' + ay' + by = 0$, then
Abel's formula for the Wronskian gives the next result.

THEOREM 19 Let $y_1(x)$ and $y_2(x)$ be solutions to the differential equation $y'' + ay' + by = 0$.
The functions $y_1$ and $y_2$ are linearly independent if and only if $W[y_1, y_2](x) \ne 0$
for all x in I.


At this point we are now ready to show that any two linearly independent solutions
to the differential equation $y'' + ay' + by = 0$ span the subspace of solutions.

THEOREM 20 Fundamental Set of Solutions Suppose that $y_1(x)$ and $y_2(x)$ are two linearly
independent solutions, on the interval I, to the differential equation

\[
y'' + ay' + by = 0
\]

Then every solution can be written as a linear combination of $y_1(x)$ and $y_2(x)$.

Proof Let y(x) be a particular solution to the initial-value problem

\[
y'' + ay' + by = 0 \qquad \text{with} \qquad y(x_0) = y_0 \quad \text{and} \quad y'(x_0) = y_0'
\]

for some $x_0$ in I. We claim that there exist real numbers $c_1$ and $c_2$ such that

\[
y(x) = c_1y_1(x) + c_2y_2(x)
\]

Differentiating both sides of this equation gives

\[
y'(x) = c_1y_1'(x) + c_2y_2'(x)
\]

Now substituting $x_0$ into both of these equations and using the initial conditions
above, we obtain the linear system of two equations in the two variables $c_1$ and $c_2$
given by

\[
\begin{aligned}
c_1y_1(x_0) + c_2y_2(x_0) &= y_0 \\
c_1y_1'(x_0) + c_2y_2'(x_0) &= y_0'
\end{aligned}
\]

Observe that the determinant of the coefficient matrix is the Wronskian
$W[y_1, y_2](x_0)$. Since $y_1(x)$ and $y_2(x)$ are linearly independent, then by Theorem 19,
the determinant of the coefficient matrix is nonzero. Consequently, by Theorem 17
of Sec. 1.6, there exist unique numbers $c_1$ and $c_2$ that provide a solution for the
linear system. Define the function g by

\[
g(x) = c_1y_1(x) + c_2y_2(x)
\]

Then g(x) is also a solution to the original initial-value problem. By the uniqueness
part of Theorem 16,

\[
y(x) = g(x) = c_1y_1(x) + c_2y_2(x)
\]

as claimed.


The linearly independent solutions $y_1(x)$ and $y_2(x)$ of Theorem 20 are called
a fundamental set of solutions. In light of this theorem, the fundamental set
$\{y_1(x), y_2(x)\}$ is a basis for the subspace S of solutions to $y'' + ay' + by = 0$. As
there are two of them, dim(S) = 2.

We now return to the specific cases for the solutions to $y'' + ay' + by = 0$. Recall
that for case 1 we obtained the two solutions

\[
y_1(x) = e^{r_1x} \qquad \text{and} \qquad y_2(x) = e^{r_2x}
\]

with $r_1 \ne r_2$. To show that these functions form a fundamental set, we compute the
Wronskian, so that

\[
\begin{aligned}
W[y_1, y_2](x) &= \begin{vmatrix} e^{r_1x} & e^{r_2x} \\ r_1e^{r_1x} & r_2e^{r_2x} \end{vmatrix} \\
&= r_2\left(e^{r_1x}e^{r_2x}\right) - r_1\left(e^{r_1x}e^{r_2x}\right) \\
&= r_2e^{(r_1 + r_2)x} - r_1e^{(r_1 + r_2)x} \\
&= e^{(r_1 + r_2)x}(r_2 - r_1)
\end{aligned}
\]

Since the exponential function is always greater than 0 and $r_1$ and $r_2$ are distinct, the
Wronskian is nonzero for all x, and therefore the functions are linearly independent.
Hence, $\{e^{r_1x}, e^{r_2x}\}$ is a fundamental set, and every solution y(x) to a problem of this
type has the form

\[
y(x) = c_1e^{r_1x} + c_2e^{r_2x}
\]

for scalars $c_1$ and $c_2$.

For case 2, the Wronskian is given by

\[
W[e^{rx}, xe^{rx}] = e^{2rx}
\]

Since $e^{2rx}$ is never zero, $\{e^{rx}, xe^{rx}\}$ is a fundamental set of solutions for problems of
this type.

Finally, for case 3 the Wronskian is given by

\[
W[e^{\alpha x}\cos\beta x, e^{\alpha x}\sin\beta x] = \beta e^{2\alpha x}
\]

so that $\{e^{\alpha x}\cos\beta x, e^{\alpha x}\sin\beta x\}$ is a fundamental set as long as β is not zero. If β = 0,
then the two roots of the auxiliary equation coincide and the equation falls under case 2.
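The three Wronskian formulas can be spot-checked numerically. The sketch below is illustrative and assumes the derivatives are supplied by hand (the helper `wronskian` and the sample values $r_1 = 1$, $r_2 = 2$, $r = 1$, $\alpha = 1$, $\beta = 2$ are choices made here, taken from the roots in Examples 1 to 3); it evaluates $f(x)g'(x) - f'(x)g(x)$ and compares it with the closed forms above.

```python
import math

def wronskian(f, df, g, dg, x):
    """W[f, g](x) = f(x) g'(x) - f'(x) g(x), with derivatives supplied by the caller."""
    return f(x) * dg(x) - df(x) * g(x)

x = 0.7   # arbitrary test point

# Case 1 with r1 = 1, r2 = 2: expected e^{(r1 + r2) x} (r2 - r1)
w1 = wronskian(lambda t: math.exp(t), lambda t: math.exp(t),
               lambda t: math.exp(2 * t), lambda t: 2 * math.exp(2 * t), x)

# Case 2 with r = 1: expected e^{2 r x}
w2 = wronskian(lambda t: math.exp(t), lambda t: math.exp(t),
               lambda t: t * math.exp(t), lambda t: (1 + t) * math.exp(t), x)

# Case 3 with alpha = 1, beta = 2: expected beta * e^{2 alpha x}
w3 = wronskian(lambda t: math.exp(t) * math.cos(2 * t),
               lambda t: math.exp(t) * (math.cos(2 * t) - 2 * math.sin(2 * t)),
               lambda t: math.exp(t) * math.sin(2 * t),
               lambda t: math.exp(t) * (math.sin(2 * t) + 2 * math.cos(2 * t)), x)

print(w1, math.exp(3 * x))
print(w2, math.exp(2 * x))
print(w3, 2 * math.exp(2 * x))
```

Each pair of printed values agrees up to floating-point rounding, and none of the Wronskians vanishes, consistent with the functions forming fundamental sets.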

There are many physical applications of second-order differential equations with 
constant coefficients. Two important areas are in mechanical and electrical oscillations. 


A fundamental problem in mechanics is the motion of an object on a vibrating spring.
The motion of the object is described by the solution of an initial-value problem of
the form

\[
my'' + cy' + ky = f(x) \qquad y(0) = A \qquad y'(0) = B
\]

where m is the mass of the object attached to the spring, c is the damping coefficient,
k is the stiffness of the spring, and f(x) represents some external force. If there are
no external forces acting on the system, then f(x) = 0.

EXAMPLE 4 Let the mass of an object attached to a spring be m = 1, and the spring constant
k = 4. Solve the three initial-value problems describing the position of the object
attached to the spring with no external forces; initial conditions y(0) = 2, y′(0) = 0;
and damping coefficients c equaling 2, 4, and 5.

Solution The differential equation describing the position of the object is given by

\[
y'' + cy' + 4y = 0
\]

When c = 2, the auxiliary equation for $y'' + 2y' + 4y = 0$ is

\[
r^2 + 2r + 4 = 0
\]

Since the roots are the complex values $r_1 = -1 + \sqrt{3}\,i$ and $r_2 = -1 - \sqrt{3}\,i$, the
general solution for the differential equation is

\[
y(x) = e^{-x}\left[c_1\cos(\sqrt{3}x) + c_2\sin(\sqrt{3}x)\right]
\]

From the initial conditions, we have

\[
y(x) = 2e^{-x}\left[\cos(\sqrt{3}x) + \frac{\sqrt{3}}{3}\sin(\sqrt{3}x)\right]
\]

When c = 4, the auxiliary equation for $y'' + 4y' + 4y = 0$ is

\[
r^2 + 4r + 4 = (r + 2)^2 = 0
\]

Since there is one repeated real root, the general solution for the differential
equation is

\[
y(x) = c_1e^{-2x} + c_2xe^{-2x}
\]

From the initial conditions,

\[
y(x) = 2e^{-2x}(2x + 1)
\]

When c = 5, the auxiliary equation for $y'' + 5y' + 4y = 0$ is

\[
r^2 + 5r + 4 = (r + 1)(r + 4) = 0
\]

Since there are two distinct real roots, the general solution for the differential
equation is

\[
y(x) = c_1e^{-x} + c_2e^{-4x}
\]

From the initial conditions,

\[
y(x) = \frac{2}{3}\left(4e^{-x} - e^{-4x}\right)
\]

The graphs of the solutions are shown in Fig. 1.


Figure 1 
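The step "from the initial conditions" in Example 4 amounts to solving a 2 × 2 linear system for $c_1$ and $c_2$ in each case. The sketch below carries this out for an equation $y'' + ay' + by = 0$ with conditions imposed at x = 0. It is an illustration under stated assumptions: the function name `spring_solution` is hypothetical, the mass is taken as m = 1 as in the example, and the exact test `disc == 0` presumes exactly representable coefficients.

```python
import math

def spring_solution(a, b, y0, v0):
    """Solution of y'' + a*y' + b*y = 0 with y(0) = y0 and y'(0) = v0."""
    disc = a * a - 4 * b
    if disc > 0:                         # two distinct real roots r1 < r2
        r1 = (-a - math.sqrt(disc)) / 2
        r2 = (-a + math.sqrt(disc)) / 2
        c2 = (v0 - r1 * y0) / (r2 - r1)  # from c1 + c2 = y0, r1*c1 + r2*c2 = v0
        c1 = y0 - c2
        return lambda x: c1 * math.exp(r1 * x) + c2 * math.exp(r2 * x)
    if disc == 0:                        # repeated root r
        r = -a / 2
        c1, c2 = y0, v0 - r * y0         # y = (c1 + c2*x) e^{r x}
        return lambda x: (c1 + c2 * x) * math.exp(r * x)
    alpha = -a / 2                       # complex roots alpha +/- beta*i
    beta = math.sqrt(-disc) / 2
    c1, c2 = y0, (v0 - alpha * y0) / beta
    return lambda x: math.exp(alpha * x) * (c1 * math.cos(beta * x)
                                            + c2 * math.sin(beta * x))

# Damping coefficients c = 2, 4, 5 with k = 4, y(0) = 2, y'(0) = 0
underdamped = spring_solution(2, 4, 2, 0)
critical = spring_solution(4, 4, 2, 0)
overdamped = spring_solution(5, 4, 2, 0)
for x in (0.0, 0.5, 2.0):
    print(x, underdamped(x), critical(x), overdamped(x))
```

The three returned functions reproduce the closed-form answers of Example 4 at every test point, which is a quick way to confirm the constants obtained from the initial conditions.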


Exercise Set 3.5

In Exercises 1–4, find the general solution to the
differential equation.

a. Find two distinct solutions to the homogeneous differential equation.

b. Show that the two solutions from part (a) are linearly independent.

c. Write the general solution.

1. $y'' - 5y' + 6y = 0$
2. $y'' + 3y' + 2y = 0$
3. $y'' + 4y' + 4y = 0$
4. $y'' - 4y' + 5y = 0$

In Exercises 5 and 6, find the solution to the
initial-value problem.

5. $y'' - 2y' + y = 0$   $y(0) = 1$   $y'(0) = 3$

6. $y'' - 3y' + 2y = 0$   $y(1) = 0$   $y'(1) = 1$

7. Consider the nonhomogeneous differential
equation given by

$y'' - 4y' + 3y = g(x)$ where $g(x) = 3x^2 + x + 2$

a. Find the general solution to the associated
homogeneous differential equation for which
g(x) = 0. This is called the complementary
solution and is denoted by $y_c(x)$.

b. Assume there exists a particular solution
denoted $y_p(x)$ to the nonhomogeneous
equation of the form

$y_p(x) = ax^2 + bx + c$

Substitute $y_p(x)$ into the differential equation
to find conditions on the coefficients a, b,
and c.

c. Verify that $y(x) = y_c(x) + y_p(x)$ is a solution
to the differential equation.

8. Consider the nonhomogeneous differential
equation given by

$y'' + 4y' + 3y = g(x)$ where $g(x) = 3\sin 2x$

a. Find the general solution to the associated
homogeneous differential equation for which
g(x) = 0.

b. Assume there exists a particular solution to the
nonhomogeneous equation of the form

$y_p(x) = A\cos 2x + B\sin 2x$

Substitute $y_p(x)$ into the differential equation
to find conditions on the coefficients A and B.

c. Verify that $y_c(x) + y_p(x)$ is a solution to the
differential equation.


9. Let w be the weight of an object attached to a 
spring, g the constant acceleration due to gravity 
of 32 ft/s”, k the spring constant, and d the 
distance in feet that the spring is stretched by the 


weight. Then the mass of the object is m = 7 and 
k = Ẹ. Suppose that a 2-lb weight stretches a 
spring by 6-in. Find the equation of the motion of 
the weight if the object is pulled down by 3-in 
and then released. Notice that this system is 
undamped; that is, the damping coefficient is 0. 


10. Suppose an 8-Ib object is attached to a spring 


with a spring constant of 4 lb/ft and that the 
damping force on the system is twice the velocity. 
Find the equation of the motion if the object is 
pulled down 1-ft and given an upward velocity 

of 2 ft/s. 


Review Exercises for Chapter 3 


1. Determine for which values of k the vectors 


4 cl Van We 


T E 9 


form a basis for R4. 


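2. For which values of a, b, c, d, e, and f are the vectors

\[ \begin{bmatrix} a \\ 0 \\ 0 \end{bmatrix} \qquad \begin{bmatrix} b \\ c \\ 0 \end{bmatrix} \qquad \begin{bmatrix} d \\ e \\ f \end{bmatrix} \]

a basis for $\mathbb{R}^3$?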


= a—b a 

= b+c a-c 
a. Show that S is a subspace of M2x2. 

Ded! |e 
9 
b. E 3 |as 
c. Find a basis B for S. 
d. Give a 2 x 2 matrix that is not in S. 
4. Let $S = \{p(x) = a + bx + cx^2 \mid a + b + c = 0\}$.

a. Show that S is a subspace of P2. 


abceR} 


b. Find a basis for S. Specify the dimension of S. 


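5. Suppose that $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a basis for a
vector space V.

a. Determine whether the set
$T = \{\mathbf{v}_1,\ \mathbf{v}_1 + \mathbf{v}_2,\ \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3\}$
is a basis for V.

b. Determine whether the set
$W = \{-\mathbf{v}_2 + \mathbf{v}_3,\ 3\mathbf{v}_1 + 2\mathbf{v}_2 + \mathbf{v}_3,\ \mathbf{v}_1 - \mathbf{v}_2 + 2\mathbf{v}_3\}$
is a basis for V.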
6. Let S = {v1, V2, V3}, where 


1 2 
—3 -1 


a. Explain why the set S is not a basis for R4. 


b. Show that v3 is a linear combination of vı and 
V2. 


c. Find the dimension of the span of the set S. 


d. Find a basis B for R4 that contains the vectors 
vı and vo. 


e. Show that 


N 
II 
oro 


is a basis for R4. 
f. Find the transition matrix from the ordered 
basis B to the ordered basis T. 


g. Use the matrix found in part (f) to find the 
transition matrix from the ordered basis T to 
the ordered basis B. 


h. If 




a. Give a basis for V and find its dimension. Let 
S be the set of all matrices of the form 


[oa 


and let T be the set of all matrices of the form 


x y 
y z 
b. Show that S and T are subspaces of the vector 
space V. 
c. Give bases for S and T and specify their 


dimensions. 
d. Give a description of the matrices in S N T. 


1 Find a basis for $M T and give its dimension. 
ee ee 9. Let 
—2 uj vi 
5 u= | | and v= | | 
u2 U2 
find the coordinates of v relative to the ordered such that 
basis T. u-v=0 and y/u? tus =1l= op +o 


i. If 


a. Show that B = {u, v} is a basis for R?. 


—2 
Iv] 13 b. Find the coordinates of the vector w = | k | 
V = 
$ -5 relative to the ordered basis B. 
-l1 


find the coordinates of v relative to the ordered 
basis B. 

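7. Suppose $\operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_n\} = V$ and

\[ c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0} \]

with $c_1 \neq 0$. Show that $\operatorname{span}\{\mathbf{v}_2, \ldots, \mathbf{v}_n\} = V$.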


10. 


Let c be a fixed scalar and let 
pi&)=1 po@)=xt+e 
p3(x) = (x +e)" 
a. Show that B = {pı (x), p2(x), p3(x)} is a basis 
for P2. 
b. Find the coordinates of f(x) = ao + 


8. Let V = Mox2. a,x + ax” relative to the ordered basis B. 


Chapter 3: Chapter Test 


In Exercises 1—35, determine whether the statement is 2. The set 
true or false. 1 2 0 
1. If V =R and addition and scalar multiplication S= 3- |; 1 |,) 4 
are defined as 1 —1 3 


x@®y=x+2y cOx=x+e is a basis for R°. 


then V is a vector space. 3. A line in R? is a subspace of dimension 1. 


4. The set


sm{[5 3 THLE 8} 


is a basis for M2x2. 


5. The set 
1 0 —2 
yaa a) fel E 0 
3 2 1 
is a basis for R? if and only if 
1 0 -2 
det} 2 1 0] +40 
3 2 1 
6. The set 


10. 


11. 


s{[s]p4 


is a subspace of R?. 


. The set 


S = {A € Mox2 | det(A) = 0} 


is a subspace of M2x2. 


. The set 


{2,1 +x,2— 39 x? —x +1} 


is a basis for P3. 


. The set 


{x3 — 2x? +1, x? — 4, x? + 2x, 5x} 


is a basis for P3. 


The dimension of the subspace 
s+2t 
S= t-s s,teR 
S 
of R? is 2. 
If i , 
ao 

and 

1 2 3 

E 4/7; 1pfp]5 


then span(S) = span(7). 


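12. The set

\[ S = \left\{ \begin{bmatrix} 2a \\ a \\ 0 \end{bmatrix} \;\Bigg|\; a \in \mathbb{R} \right\} \]

is a subspace of $\mathbb{R}^3$ of dimension 1.

13. If $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ and
$T = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_1 + \mathbf{v}_2\}$,
then span(S) = span(T).

14. If $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$,
$T = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_1 + \mathbf{v}_2\}$, and
S is a basis, then T is not a basis.

15. If $\{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ is a basis for a vector space V and
$\mathbf{w}_1 = \mathbf{v}_1 + 2\mathbf{v}_2 + \mathbf{v}_3$,
$\mathbf{w}_2 = \mathbf{v}_1 + \mathbf{v}_2 + \mathbf{v}_3$,
$\mathbf{w}_3 = \mathbf{v}_1 - \mathbf{v}_2 - \mathbf{v}_3$, then
$W = \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\}$ is also a basis for V.

16. If V is a vector space of dimension n and S is a set of vectors that span V,
then the number of vectors in S is less than or equal to n.

17. If V is a vector space of dimension n, then any set of n − 1 vectors is
linearly dependent.

18. If S and T are subspaces of a vector space V, then $S \cup T$ is a subspace of V.

19. If $S = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a linearly independent set of
vectors in $\mathbb{R}^n$, then S is a basis.

20. If A is a 3 × 3 matrix and for every vector
$\mathbf{b} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$ the linear system
$A\mathbf{x} = \mathbf{b}$ has a solution, then the column vectors of A span $\mathbb{R}^3$.

21. If an n × n matrix is invertible, then the column vectors form a basis for
$\mathbb{R}^n$.

22. If a vector space has bases S and T and the number of elements of S is n,
then the number of elements of T is also n.

23. In a vector space V, if

\[ \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\} = V \]

and $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m$ are any elements of V, then

\[ \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n, \mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\} = V \]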


24. If V is a vector space of dimension n and H is a
subspace of dimension n, then H = V.

25. If $B_1$ and $B_2$ are bases for the vector space V,
then the transition matrix from $B_1$ to $B_2$ is the
inverse of the transition matrix from $B_2$ to $B_1$.


In Exercises 26—29, use the bases of R? 


stn SES) 
mail El 


26. The coordinates of | : | relative to Bı, are 
1 
E 

27. The coordinates of | 


1l 1l 
[il 
28. The transition matrix from Bı to B is 
1| -1 3 
B2 _ = 
[7], tal 2 | 1 =] 
29. The transition matrix from Bz to Bı is 


mel | 


In Exercises 30—35, use the bases of P3, 


and 


a relative to B2 are 


Bı = {1, x, x”, x°} 


and 


30. 


31. 


32. 


33. 


34. 


35. 




By = {x, x", 1, x°} 


[x3 + 2x? — x], 


[x3 + 2x? — x]s, = 


[x3 + 2x? — x]s 


[x3 + 2x? — x]B, 


[d +x)? —3(x? +x — 1) + x°]p, 


H = 


oro 


1 
2 
1 
1 
1 


1 


1 


= oo Ne 


7 
p 
a 


ooor 


ooro 


= Ooo 




CHAPTER 4

Linear Transformations

CHAPTER OUTLINE
4.1 Linear Transformations
4.2 The Null Space and Range
4.3 Isomorphisms
4.4 Matrix Representation of a Linear Transformation
4.5 Similarity
4.6 Application: Computer Graphics

A critical component in the design of an airplane is the airflow over the wing. Four
forces that act on an aircraft, and need to be considered in its design, are lift, the
force of gravity, thrust, and drag. Lift and drag are aerodynamic forces that are
generated by the movement of the aircraft through the air. During takeoff, thrust from
the engines must overcome drag. Lift, created by the rush of air over the wing, must
overcome the force of gravity before the airplane can fly. Mathematical models
developed by aeronautical engineers simulate the behavior of an aircraft in flight.
These models involve linear systems with millions of equations and variables. As we
saw in Chap. 1, linear algebra provides systematic methods for solving these equations.

Another use of linear algebra in the design process of an airplane is in modeling the
movement of the aircraft through space. To check the feasibility of their designs,
aeronautical engineers use computer graphics to visualize simulations of the aircraft
in flight. Three control parameters which affect the position of an aircraft are pitch,
roll, and yaw. The pitch measures the fore and aft tilt of an airplane, relative to the
earth, while the roll measures the tilt from side to side. Together these give the
attitude of the aircraft. Using the figure above, the pitch is a rotation about the
y axis, while a roll is a rotation about the x axis. The yaw measures the rotation about
the z axis, and when combined with the pitch, gives the heading. During a simulation,
the attitude and heading of the aircraft can be changed by applying a transformation
to its coordinates relative to a predefined center of equilibrium. As we shall see in
this chapter, such a transformation can be represented by matrix multiplication.
Specifically, if the angles of rotation for pitch, roll, and yaw are given by
$\theta$, $\varphi$, and $\psi$, respectively, then the matrix representations for these
transformations are given by

\[
\begin{bmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{bmatrix}
\qquad
\begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\varphi & -\sin\varphi \\ 0 & \sin\varphi & \cos\varphi \end{bmatrix}
\qquad\text{and}\qquad
\begin{bmatrix} \cos\psi & \sin\psi & 0 \\ -\sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{bmatrix}
\]

This type of transformation is a linear map between vector spaces, in this case from
$\mathbb{R}^3$ to $\mathbb{R}^3$. The generation and manipulation of computer graphics is
one of many applications that require the linear transformations which are introduced
in this chapter.
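As a small illustration of the three rotation matrices for pitch, roll, and yaw, the sketch below builds each one as a nested list and applies it to a coordinate vector. The function names `pitch`, `roll`, `yaw`, and `apply` are hypothetical helpers chosen for this example; only the matrix entries come from the text.

```python
import math

def pitch(theta):
    """Rotation about the y axis by angle theta (radians)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, -s], [0.0, 1.0, 0.0], [s, 0.0, c]]

def roll(phi):
    """Rotation about the x axis by angle phi (radians)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def yaw(psi):
    """Rotation about the z axis by angle psi (radians)."""
    c, s = math.cos(psi), math.sin(psi)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def apply(matrix, vector):
    """Matrix-vector product: each output entry is a row dotted with the vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

nose = [1.0, 0.0, 0.0]                     # a point on the x axis
print(apply(pitch(math.pi / 2), nose))     # rotated into the z direction
print(apply(yaw(math.pi / 2), nose))       # rotated within the xy plane
print(apply(roll(math.pi / 2), nose))      # unchanged: it lies on the roll axis
```

Each matrix leaves the vector along its own rotation axis fixed and preserves lengths, which is a quick sanity check when entering the entries by hand.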


Due to their wide applicability linear transformations on vector spaces are of 
general interest and are the subject of this chapter. As functions between vector spaces, 
they are special since they preserve the additive structure of linear combinations. That 
is, the image of a linear combination under a linear transformation is also a linear 
combination in the range. In this chapter we investigate the connection between linear 
transformations and matrices, showing that every linear transformation between finite 
dimensional vector spaces can be written as a matrix multiplication. 


4.1 Linear Transformations


In mathematics every line of inquiry ultimately leads to a description of some set and 
functions on that set. One may metaphorically refer to elements of the set as nouns 
and functions that operate on elements as verbs. In linear algebra the sets are vector 
spaces, which we discussed in Chap. 3, and linear transformations on vector spaces 
are the functions. 

If V and W are vector spaces, then a mapping T from V to W is a function that
assigns to each vector v in V a unique vector w in W. In this case we say that T
maps V into W, and we write $T: V \to W$. For each v in V the vector w = T(v) in
W is the image of v under T.


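EXAMPLE 1 Define a mapping $T: \mathbb{R}^2 \to \mathbb{R}^2$ by

\[ T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x + y \\ x - y \end{bmatrix} \]

a. Find the image of the standard basis vectors $\mathbf{e}_1$ and $\mathbf{e}_2$ under the mapping T.

b. Give a description of all vectors in $\mathbb{R}^2$ that are mapped to the zero vector.

c. Show that the mapping T satisfies

\[ T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) \quad \text{(preserves vector space addition)} \]

and

\[ T(c\mathbf{v}) = cT(\mathbf{v}) \quad \text{(preserves scalar multiplication)} \]

for all vectors u and v in $\mathbb{R}^2$ and all scalars c in $\mathbb{R}$.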


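Solution a. Since $\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and
$\mathbf{e}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, we have

\[ T(\mathbf{e}_1) = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad T(\mathbf{e}_2) = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

b. To answer this question, we solve

\[ T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x + y \\ x - y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]

This leads to the linear system

\[ \begin{aligned} x + y &= 0 \\ x - y &= 0 \end{aligned} \]

where the unique solution is x = y = 0. Thus, the only vector that is mapped
by T to $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$ is the zero vector $\begin{bmatrix} 0 \\ 0 \end{bmatrix}$.

c. To show that the mapping T preserves vector space addition, let

\[ \mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \quad\text{and}\quad \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \]

Then

\[
\begin{aligned}
T(\mathbf{u} + \mathbf{v}) &= T\left(\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} + \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}\right)
 = T\left(\begin{bmatrix} u_1 + v_1 \\ u_2 + v_2 \end{bmatrix}\right) \\
&= \begin{bmatrix} (u_1 + v_1) + (u_2 + v_2) \\ (u_1 + v_1) - (u_2 + v_2) \end{bmatrix}
 = \begin{bmatrix} u_1 + u_2 \\ u_1 - u_2 \end{bmatrix} + \begin{bmatrix} v_1 + v_2 \\ v_1 - v_2 \end{bmatrix} \\
&= T(\mathbf{u}) + T(\mathbf{v})
\end{aligned}
\]

We also have

\[
T(c\mathbf{u}) = T\left(\begin{bmatrix} cu_1 \\ cu_2 \end{bmatrix}\right)
 = \begin{bmatrix} cu_1 + cu_2 \\ cu_1 - cu_2 \end{bmatrix}
 = c\begin{bmatrix} u_1 + u_2 \\ u_1 - u_2 \end{bmatrix}
 = cT(\mathbf{u})
\]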

A mapping T between vector spaces V and W that satisfies the two properties, 
as in Example 1, 


\[ T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}) \qquad\text{and}\qquad T(c\mathbf{u}) = cT(\mathbf{u}) \]


is called a linear transformation from V into W. Notice that the operations of addition 
and scalar multiplication on the left-hand side of each equation refer to operations in 
the vector space V, and on the right-hand side refer to operations in the vector space W. 

Definition 1 combines the two requirements for the linearity of T into one 
statement. 


DEFINITION 1 Linear Transformation Let V and W be vector spaces. The mapping
$T: V \to W$ is called a linear transformation if and only if

\[ T(c\mathbf{u} + \mathbf{v}) = cT(\mathbf{u}) + T(\mathbf{v}) \]

for every choice of u and v in V and scalars c in $\mathbb{R}$. In the case for which V = W,
then T is called a linear operator.
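A short check, offered as a sketch rather than as part of the original argument, of why the single condition in Definition 1 carries the same information as the two separate properties (additivity and compatibility with scalar multiplication):

```latex
\begin{align*}
\text{Take } c = 1,\ \mathbf{u} = \mathbf{v} = \mathbf{0}: &\quad
  T(\mathbf{0}) = T(\mathbf{0}) + T(\mathbf{0}) \ \Longrightarrow\ T(\mathbf{0}) = \mathbf{0}, \\
\text{Take } c = 1: &\quad
  T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v}), \\
\text{Take } \mathbf{v} = \mathbf{0}: &\quad
  T(c\mathbf{u}) = cT(\mathbf{u}) + T(\mathbf{0}) = cT(\mathbf{u}), \\
\text{Conversely:} &\quad
  T(c\mathbf{u} + \mathbf{v}) = T(c\mathbf{u}) + T(\mathbf{v}) = cT(\mathbf{u}) + T(\mathbf{v}).
\end{align*}
```

The first three lines derive the two properties from the single condition; the last line recovers the single condition from the two properties.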


The mapping T defined in Example 1 is a linear operator on $\mathbb{R}^2$. In Example 2
we show how matrices can be used to define linear transformations.


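EXAMPLE 2 Let A be an m × n matrix. Define a mapping $T: \mathbb{R}^n \to \mathbb{R}^m$ by

\[ T(\mathbf{x}) = A\mathbf{x} \]

a. Show that T is a linear transformation.
b. Let A be the 2 × 3 matrix

\[ A = \begin{bmatrix} 1 & 2 & -1 \\ -1 & 3 & 2 \end{bmatrix} \]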
Find the images of 


1 T 
1 and —1 
1 


under the mapping $T: \mathbb{R}^3 \to \mathbb{R}^2$ with $T(\mathbf{x}) = A\mathbf{x}$.


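Solution a. By Theorem 5 of Sec. 1.3, for all vectors u and v in $\mathbb{R}^n$ and all scalars c in $\mathbb{R}$,

\[ A(c\mathbf{u} + \mathbf{v}) = cA\mathbf{u} + A\mathbf{v} \]

Therefore,

\[ T(c\mathbf{u} + \mathbf{v}) = cT(\mathbf{u}) + T(\mathbf{v}) \]

b. Since T is defined by matrix multiplication, we have

\[
T\left(\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right)
 = \begin{bmatrix} 1 & 2 & -1 \\ -1 & 3 & 2 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
 = \begin{bmatrix} 2 \\ 4 \end{bmatrix}
\]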
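A mapping defined by matrix multiplication can be written out directly. The sketch below assumes the 2 × 3 matrix has rows (1, 2, −1) and (−1, 3, 2), entries read from the product displayed in the solution, so they should be treated as an assumption. It computes the image of (1, 1, 1) and checks the linearity condition for one hypothetical choice of c, u, and v.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product: each entry is a row of the matrix dotted with the vector."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

# Assumed entries of the 2 x 3 matrix A (read from the displayed product).
A = [[1, 2, -1],
     [-1, 3, 2]]

def T(x):
    """The linear transformation T(x) = Ax from R^3 to R^2."""
    return mat_vec(A, x)

u = [1, 1, 1]
v = [2, 0, -3]      # an arbitrary second vector chosen for illustration
c = 5

lhs = T([c * a + b for a, b in zip(u, v)])          # T(cu + v)
rhs = [c * a + b for a, b in zip(T(u), T(v))]       # cT(u) + T(v)

print(T(u))          # image of (1, 1, 1)
print(lhs == rhs)    # linearity check for this choice of c, u, v
```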


Solution 


and 




Later in this chapter, in Sec. 4.4, we show that every linear transformation between 
finite dimensional vector spaces can be represented by a matrix. In Examples 1 and 2, we 
have discussed some of the algebraic properties of linear transformations. In Example 3 
we consider the action of a linear transformation from a geometric perspective. 


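EXAMPLE 3 Define a linear transformation $T: \mathbb{R}^3 \to \mathbb{R}^2$ by

\[ T\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} x \\ y \end{bmatrix} \]

a. Discuss the action of T on a vector in $\mathbb{R}^3$, and give a geometric interpretation
of the equation

\[
T\left(\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\right)
 = T\left(\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}\right) + T\left(\begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}\right)
\]

b. Find the image of the set

\[ S_1 = \left\{ t\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \;\Bigg|\; t \in \mathbb{R} \right\} \]

c. Find the image of the set

\[ S_2 = \left\{ \begin{bmatrix} x \\ y \\ 3 \end{bmatrix} \;\Bigg|\; x, y \in \mathbb{R} \right\} \]

d. Describe the set

\[ S_3 = \left\{ \begin{bmatrix} x \\ 0 \\ z \end{bmatrix} \;\Bigg|\; x, z \in \mathbb{R} \right\} \]

and find its image.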

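Solution a. The linear transformation T gives the projection, or shadow, of a vector in
3-space to its image in the xy plane. Let

\[
\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \qquad
\mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \qquad\text{and}\qquad
\mathbf{v}_3 = \mathbf{v}_1 + \mathbf{v}_2 = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}
\]

The images of these vectors are shown in Fig. 1. We see from the figure that
$T(\mathbf{v}_3) = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is equal to the vector sum
$T(\mathbf{v}_1) + T(\mathbf{v}_2) = \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix}$,
as desired.

Figure 1

b. The set $S_1$ is a line in 3-space with direction vector
$\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$. By the definition of T we have

\[ T(S_1) = \left\{ t\begin{bmatrix} 1 \\ 2 \end{bmatrix} \;\Bigg|\; t \in \mathbb{R} \right\} \]

which is a line in $\mathbb{R}^2$ through the origin with slope 2.

c. The set $S_2$ is a plane in 3-space 3 units above and parallel to the xy plane. In
this case,

\[ T(S_2) = \left\{ \begin{bmatrix} x \\ y \end{bmatrix} \;\Bigg|\; x, y \in \mathbb{R} \right\} \]

Thus, the image of $S_2$ is the entire xy plane, which from the description of T
as a projection is the result we expect.

d. The set $S_3$ is the xz plane. Here we have

\[ T(S_3) = \left\{ \begin{bmatrix} x \\ 0 \end{bmatrix} \;\Bigg|\; x \in \mathbb{R} \right\} \]

which is just the x axis. Again, this is the expected result, given our description
of T.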


In Example 4 we use the derivative of a function to define a linear transformation 
between vector spaces of polynomials. 


EXAMPLE 4 Define a mapping $T: P_3 \to P_2$ by

\[ T(p(x)) = p'(x) \]

where $p'(x)$ is the derivative of $p(x)$.

a. Show that T is a linear transformation.
b. Find the image of the polynomial $p(x) = 3x^3 + 2x^2 - x + 2$.
c. Describe the polynomials in $P_3$ that are mapped to the zero vector of $P_2$.

Solution First observe that if $p(x)$ is in $P_3$, then it has the form

\[ p(x) = ax^3 + bx^2 + cx + d \]

so that

\[ T(p(x)) = p'(x) = 3ax^2 + 2bx + c \]

Since $p'(x)$ is in $P_2$, then T is a map from $P_3$ into $P_2$.

a. To show that T is linear, let $p(x)$ and $q(x)$ be polynomials of degree 3 or less,
and let k be a scalar. Recall from calculus that the derivative of a sum is the
sum of the derivatives, and that the derivative of a scalar times a function is
the scalar times the derivative of the function. Consequently,

\[
\begin{aligned}
T(kp(x) + q(x)) &= \frac{d}{dx}\bigl(kp(x) + q(x)\bigr) \\
&= \frac{d}{dx}\bigl(kp(x)\bigr) + \frac{d}{dx}\bigl(q(x)\bigr) \\
&= kp'(x) + q'(x) \\
&= kT(p(x)) + T(q(x))
\end{aligned}
\]

Therefore, the mapping T is a linear transformation.

b. The image of the polynomial $p(x) = 3x^3 + 2x^2 - x + 2$ is

\[ T(p(x)) = \frac{d}{dx}\left(3x^3 + 2x^2 - x + 2\right) = 9x^2 + 4x - 1 \]

c. The only functions in $P_3$ with derivative equal to zero are the constant
polynomials $p(x) = c$, where c is a real number.

PROPOSITION 1 Let V and W be vector spaces, and let $T: V \to W$ be a linear transformation. Then
$T(\mathbf{0}) = \mathbf{0}$.

Proof Since $T(\mathbf{0}) = T(\mathbf{0} + \mathbf{0})$ and T is a linear transformation, we know that
$T(\mathbf{0}) = T(\mathbf{0} + \mathbf{0}) = T(\mathbf{0}) + T(\mathbf{0})$. Subtracting $T(\mathbf{0})$ from both sides of the last
equation gives $T(\mathbf{0}) = \mathbf{0}$.
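The derivative map on polynomials can be made concrete by storing a polynomial as its list of coefficients. This is a minimal sketch under that assumption (index i holds the coefficient of x to the power i); the helper names are illustrative and not from the text. It reproduces the image of 3x³ + 2x² − x + 2, checks linearity on one pair of polynomials, and confirms that the zero polynomial and constants are sent to the zero polynomial.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [a0, a1, a2, ...] (a_i multiplies x**i)."""
    return [i * c for i, c in enumerate(coeffs)][1:] or [0]

def combine(k, p, q):
    """Coefficients of k*p + q for two coefficient lists of equal length."""
    return [k * a + b for a, b in zip(p, q)]

p = [2, -1, 2, 3]     # 2 - x + 2x^2 + 3x^3
q = [0, 4, 0, -1]     # 4x - x^3, an arbitrary second element of P3
k = 6

print(derivative(p))                       # coefficients of -1 + 4x + 9x^2
lhs = derivative(combine(k, p, q))         # T(kp + q)
rhs = combine(k, derivative(p), derivative(q))   # kT(p) + T(q)
print(lhs == rhs)
print(derivative([5, 0, 0, 0]))            # a constant polynomial maps to zero
```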


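EXAMPLE 5 Define a mapping $T: \mathbb{R}^2 \to \mathbb{R}^2$ by

\[ T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} e^x \\ e^y \end{bmatrix} \]

Determine whether T is a linear transformation.

Solution Since

\[ T(\mathbf{0}) = T\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}\right) = \begin{bmatrix} e^0 \\ e^0 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \neq \mathbf{0} \]

by the contrapositive of Proposition 1, we know that T is not a linear
transformation.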


EXAMPLE 6 Define a mapping $T: M_{m \times n} \to M_{n \times m}$ by

\[ T(A) = A^t \]

Show that the mapping is a linear transformation.

Solution By Theorem 6 of Sec. 1.3, we have

\[ T(A + B) = (A + B)^t = A^t + B^t = T(A) + T(B) \]

Also by this same theorem,

\[ T(cA) = (cA)^t = cA^t = cT(A) \]

Thus, T is a linear transformation.


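EXAMPLE 7 Coordinates Let V be a vector space with dim(V) = n, and
$B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ an ordered basis for V. Let
$T: V \to \mathbb{R}^n$ be the map that sends a vector v in V to its coordinate vector in
$\mathbb{R}^n$ relative to B. That is,

\[ T(\mathbf{v}) = [\mathbf{v}]_B \]

It was shown in Sec. 3.4 that this map is well defined, that is, the coordinate vector
of v relative to B is unique. Show that the map T is also a linear transformation.

Solution Let u and v be vectors in V and let k be a scalar. Since B is a basis, there are
unique sets of scalars $c_1, \ldots, c_n$ and $d_1, \ldots, d_n$ such that

\[ \mathbf{u} = c_1\mathbf{v}_1 + \cdots + c_n\mathbf{v}_n \quad\text{and}\quad \mathbf{v} = d_1\mathbf{v}_1 + \cdots + d_n\mathbf{v}_n \]

Applying T to the vector $k\mathbf{u} + \mathbf{v}$ gives

\[
\begin{aligned}
T(k\mathbf{u} + \mathbf{v}) &= T\bigl((kc_1 + d_1)\mathbf{v}_1 + \cdots + (kc_n + d_n)\mathbf{v}_n\bigr) \\
&= \begin{bmatrix} kc_1 + d_1 \\ kc_2 + d_2 \\ \vdots \\ kc_n + d_n \end{bmatrix}
 = k\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} + \begin{bmatrix} d_1 \\ d_2 \\ \vdots \\ d_n \end{bmatrix} \\
&= kT(\mathbf{u}) + T(\mathbf{v})
\end{aligned}
\]

Therefore, we have shown that the mapping T is a linear transformation.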


As mentioned earlier, when $T: V \to W$ is a linear transformation, then the
structure of V is preserved when it is mapped into W. Specifically, the image of a
linear combination of vectors, under a linear map, is equal to a linear combination
of the image vectors with the same coefficients. To see this, let V and W be vector
spaces and $T: V \to W$ be a linear transformation. Then by repeated application of
Definition 1, we have

\[
\begin{aligned}
T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n) &= T(c_1\mathbf{v}_1) + \cdots + T(c_n\mathbf{v}_n) \\
&= c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + \cdots + c_nT(\mathbf{v}_n)
\end{aligned}
\]


The fact that a linear transformation T between vector spaces V and W preserves 
linear combinations is useful in evaluating T when its action on the vectors of a basis 
for V is known. This is illustrated in Example 8. 


Let T: R? — R? be a linear transformation, and let B be the standard basis for R°. 


If 
T (e1) = | | T (e2) = | E | and T (e3) = | i | 
find T (v), where 


Solution To find the image of the vector v, we first write the vector as a linear combination
of the basis vectors. In this case

\[ \mathbf{v} = \mathbf{e}_1 + 3\mathbf{e}_2 + 2\mathbf{e}_3 \]

Applying T to this linear combination and using the linearity properties of T, we
have

\[
\begin{aligned}
T(\mathbf{v}) &= T(\mathbf{e}_1 + 3\mathbf{e}_2 + 2\mathbf{e}_3) \\
&= T(\mathbf{e}_1) + 3T(\mathbf{e}_2) + 2T(\mathbf{e}_3)
\end{aligned}
\]


ibe | | 


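EXAMPLE 9 Let $T: \mathbb{R}^3 \to \mathbb{R}^3$ be a linear operator and B a basis for $\mathbb{R}^3$ given by

\[ B = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} \right\} \]

If

\[
T\left(\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \qquad
T\left(\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\right) = \begin{bmatrix} -1 \\ -2 \\ -3 \end{bmatrix} \qquad
T\left(\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix}
\]

find

\[ T\left(\begin{bmatrix} 2 \\ 3 \\ 6 \end{bmatrix}\right) \]

Solution Since B is a basis for $\mathbb{R}^3$, there are (unique) scalars $c_1$, $c_2$, and $c_3$ such that

\[ c_1\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + c_2\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + c_3\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ 6 \end{bmatrix} \]

Solving this equation, we obtain $c_1 = -1$, $c_2 = 1$, and $c_3 = 2$. Hence,

\[ T\left(\begin{bmatrix} 2 \\ 3 \\ 6 \end{bmatrix}\right) = T\left(-\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + 2\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}\right) \]

By the linearity of T, we have

\[
\begin{aligned}
T\left(\begin{bmatrix} 2 \\ 3 \\ 6 \end{bmatrix}\right)
&= (-1)\,T\left(\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}\right) + T\left(\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}\right) + 2\,T\left(\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}\right) \\
&= -\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + \begin{bmatrix} -1 \\ -2 \\ -3 \end{bmatrix} + 2\begin{bmatrix} 2 \\ 2 \\ 4 \end{bmatrix}
 = \begin{bmatrix} 2 \\ 1 \\ 4 \end{bmatrix}
\end{aligned}
\]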
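The procedure of this example (find the coordinates of the target vector relative to the basis, then combine the known images with the same coefficients) can be sketched as follows. The solver below is a plain Gauss-Jordan elimination over exact fractions written for this illustration; it assumes the given vectors form a basis, so that the solution is unique, and the basis vectors and their images are those read from the example.

```python
from fractions import Fraction

def solve(columns, target):
    """Solve c1*columns[0] + ... + cn*columns[n-1] = target by Gauss-Jordan elimination.

    Assumes the columns form a basis, so a pivot is always found.
    """
    n = len(target)
    # Augmented matrix: row i holds the i-th entries of each column, then target[i].
    rows = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        lead = rows[col][col]
        rows[col] = [entry / lead for entry in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

basis = [(1, 1, 1), (1, 2, 3), (1, 1, 2)]
images = [(1, 1, 1), (-1, -2, -3), (2, 2, 4)]     # T applied to each basis vector

coords = solve(basis, (2, 3, 6))
image = [sum(c * img[i] for c, img in zip(coords, images)) for i in range(3)]

print(coords)   # coordinates c1, c2, c3 of (2, 3, 6) relative to the basis
print(image)    # T(2, 3, 6) obtained by linearity
```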
—3 4 




Operations with Linear Transformations 


Linear transformations can be combined by using a natural addition and scalar mul- 
tiplication to produce new linear transformations. For example, let S, T: R? > R? be 
defined by 


(EDP » GDP] 


(S+T)(v) =S(v)+T(v) and — (cS)(v) = c(S(y)) 


To illustrate this definition, let v = | E |: then 


($+ TH) = SW) + TW) = | gia + | a = | E 


—2 2+ 3(-1) 3 


For scalar multiplication let c = 3. Then 
(3T)(v) = 3T(v) = | È | = | 1> | 


In Theorem 1 we show that these operations on linear transformations produce 
linear transformations. 


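THEOREM 1 Let V and W be vector spaces and let $S, T: V \to W$ be linear transformations. The
function S + T defined by

\[ (S + T)(\mathbf{v}) = S(\mathbf{v}) + T(\mathbf{v}) \]

is a linear transformation from V into W. If c is any scalar, the function cS
defined by

\[ (cS)(\mathbf{v}) = cS(\mathbf{v}) \]

is a linear transformation from V into W.

Proof Let $\mathbf{u}, \mathbf{v} \in V$ and let d be any scalar. Then

\[
\begin{aligned}
(S + T)(d\mathbf{u} + \mathbf{v}) &= S(d\mathbf{u} + \mathbf{v}) + T(d\mathbf{u} + \mathbf{v}) \\
&= S(d\mathbf{u}) + S(\mathbf{v}) + T(d\mathbf{u}) + T(\mathbf{v}) \\
&= dS(\mathbf{u}) + S(\mathbf{v}) + dT(\mathbf{u}) + T(\mathbf{v}) \\
&= d\bigl(S(\mathbf{u}) + T(\mathbf{u})\bigr) + S(\mathbf{v}) + T(\mathbf{v}) \\
&= d(S + T)(\mathbf{u}) + (S + T)(\mathbf{v})
\end{aligned}
\]

so that S + T is a linear transformation. Also

\[
\begin{aligned}
(cS)(d\mathbf{u} + \mathbf{v}) &= c\bigl(S(d\mathbf{u} + \mathbf{v})\bigr) \\
&= c\bigl(S(d\mathbf{u}) + S(\mathbf{v})\bigr) \\
&= c\bigl(dS(\mathbf{u}) + S(\mathbf{v})\bigr) \\
&= (cd)S(\mathbf{u}) + cS(\mathbf{v}) \\
&= d(cS)(\mathbf{u}) + (cS)(\mathbf{v})
\end{aligned}
\]

so that cS is a linear transformation.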


Using the sum of two linear transformations and the scalar product defined above,
the set of all linear transformations between two given vector spaces is itself a vector
space, denoted by $\mathcal{L}(U, V)$. Verification of this is left to Exercise 45 at the end of this
section.

As we saw in Example 2, every m × n matrix A defines a linear map from $\mathbb{R}^n$ to
$\mathbb{R}^m$. Also, if B is an n × p matrix, then B defines a linear map from $\mathbb{R}^p$ to $\mathbb{R}^n$. The
product matrix AB, which is an m × p matrix, then defines a linear transformation
from $\mathbb{R}^p$ to $\mathbb{R}^m$. As we shall see (in Sec. 4.4), this map corresponds to the composition
of the maps defined by A and B. The desire for this correspondence is what motivated
the definition of matrix multiplication given in Sec. 1.3.


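THEOREM 2 Let U, V, and W be vector spaces. If $T: V \to U$ and $S: U \to W$ are linear
transformations, then the composition map $S \circ T: V \to W$, defined by

\[ (S \circ T)(\mathbf{v}) = S(T(\mathbf{v})) \]

is a linear transformation. (See Fig. 2.)

Figure 2

Proof To show that $S \circ T$ is a linear transformation, let $\mathbf{v}_1$ and $\mathbf{v}_2$ be vectors in
V and c a scalar. Applying $S \circ T$ to $c\mathbf{v}_1 + \mathbf{v}_2$, we obtain

\[
\begin{aligned}
(S \circ T)(c\mathbf{v}_1 + \mathbf{v}_2) &= S\bigl(T(c\mathbf{v}_1 + \mathbf{v}_2)\bigr) \\
&= S\bigl(cT(\mathbf{v}_1) + T(\mathbf{v}_2)\bigr) \\
&= S\bigl(cT(\mathbf{v}_1)\bigr) + S\bigl(T(\mathbf{v}_2)\bigr) \\
&= cS\bigl(T(\mathbf{v}_1)\bigr) + S\bigl(T(\mathbf{v}_2)\bigr) \\
&= c(S \circ T)(\mathbf{v}_1) + (S \circ T)(\mathbf{v}_2)
\end{aligned}
\]

This shows that $S \circ T$ is a linear transformation.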
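For maps defined by matrices, composition corresponds to the matrix product: applying B and then A to a vector gives the same result as applying AB once. The sketch below illustrates this with two small matrices chosen only for the example (they are hypothetical and do not come from the text); `mat_vec` and `mat_mul` are illustrative helper names.

```python
def mat_vec(matrix, vector):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]

def mat_mul(a, b):
    """Matrix product: entry (i, j) is row i of a dotted with column j of b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

A = [[1, 2, -1],
     [-1, 3, 2]]       # 2 x 3: defines S from R^3 to R^2 (illustrative entries)
B = [[1, 0],
     [2, 1],
     [0, 3]]           # 3 x 2: defines T from R^2 to R^3 (illustrative entries)

x = [2, -1]
composed = mat_vec(A, mat_vec(B, x))     # (S o T)(x) = S(T(x))
product = mat_vec(mat_mul(A, B), x)      # (AB)x

print(mat_mul(A, B))
print(composed, product, composed == product)
```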


In the case of all linear operators on a vector space V, denoted by $\mathcal{L}(V, V)$, the
operations of addition and scalar multiplication make $\mathcal{L}(V, V)$ a vector space. If, in
addition, we define a product on $\mathcal{L}(V, V)$ by

\[ (ST)(\mathbf{v}) = (S \circ T)(\mathbf{v}) \]

then the product satisfies the necessary properties making $\mathcal{L}(V, V)$ a linear algebra.




Fact Summary 


Let V, W, and Z be vector spaces and S and T functions from V into W.

1. The function T is a linear transformation provided that for all u, v in V and
all scalars c, $T(c\mathbf{u} + \mathbf{v}) = cT(\mathbf{u}) + T(\mathbf{v})$.

2. If A is an m × n matrix and T is defined by $T(\mathbf{x}) = A\mathbf{x}$, then T is a linear
transformation from $\mathbb{R}^n$ into $\mathbb{R}^m$.

3. If T is a linear transformation, then the zero vector in V is mapped to the
zero vector in W, that is, $T(\mathbf{0}) = \mathbf{0}$.

4. If $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an ordered basis for V and $W = \mathbb{R}^n$, then the
coordinate mapping $T(\mathbf{v}) = [\mathbf{v}]_B$ is a linear transformation.

5. If $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is a set of vectors in V and T is a linear transformation,
then

\[ T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n) = c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + \cdots + c_nT(\mathbf{v}_n) \]

for all scalars $c_1, \ldots, c_n$.

6. If S and T are linear transformations and c is a scalar, then S + T and cT
are linear transformations.

7. If $T: V \to W$ is a linear transformation and $L: W \to Z$ is a linear
transformation, then $L \circ T: V \to Z$ is a linear transformation.


Exercise Set 4.1 


In Exercises 1—6, determine whether the function In Exercises 7—16, determine whether the function is a 
T: R? — R? is a linear transformation. linear transformation between vector spaces. 


al her 7. T: R > R, T(x) = x? 
y X 


8. T: R > R, T(x) = —2x 
9. T: RT (| > |) =+ 


x 
y? | 10. T: R > R?, 


x x 
y = d 11. T: R > R?, 




12. T: R > R?, 
x cos x 
T y = sin y 
Zz sinx + sinz 
13. T: P > P3, 


T (p(x)) = 2p" (x) — 3p'(x) + p(x) 
14. T: Pa > Ph, 


T(p(x)) = pœ) +x 
15. $T: M_{2 \times 2} \to \mathbb{R}$, $T(A) = \det(A)$
16. $T: M_{2 \times 2} \to M_{2 \times 2}$, $T(A) = A + A^t$


In Exercises 17—20, a function T: V > W between 
vector spaces and two vectors u and v in V are given. 


a. Find T(u) and T(v). 
b. Is T(u + v) = T(u) + T(v)?
c. Is T a linear transformation? 


17. Define T: R? > R? by 


Let 


“(4 


18. Define T: Pz > P2 by 


T (p(x)) = p(x) — 2p'(x) + p(x) 


Let 
u=x7-3x+1 v=-x-1 


19. Define T: P3 > R? by 


3 2 —a—b+1 
T (ax? + bx +or4+d=| aie 
Let 
u= —x? + 2x7 -—x+4+1 v=x? -1 


20. Define T: R? > R? by 


X 2 
x^— 1 
T = 
Tara 


Let 


= 
l 
= NI 


21. If T: R? — R? is a linear operator and 


(D-E 
mamar ([ 3) 


22. If T: R? — R? is a linear operator and 


1 1 
T 0 = p=] 
0 0 
0 2 
T 1 =| 0 
0 1 
0 1 
T 0 =| =) 
1 1 


1 
then find T 7 
5 


23. If $T: P_2 \to P_2$ is a linear operator and

\[ T(1) = 1 + x \qquad T(x) = 2 + x^2 \qquad T(x^2) = x - 3x^2 \]

then find $T(-3 + x - x^2)$.


24. If T: Mox2 —> M2x2 is a linear operator and 


25. 


26. 


27. 


28. 


then find 
2 1 
(Lt a]) 


Suppose that T: R? —> R? is a linear operator 


such that 
els 
(lo) 
Is it possible to determine T (| : $. If so, find 


it; and if not, explain why. 


Define a linear operator T: R? —> R? by 
T (u) = Au, where 


> 
| 
-= N e 
Ww” = N 
NULU 


a. Find T (e1), T (e2), and T (e3). 
b. Find T (3e; — 4e2 + 6e3). 


Suppose that T: P2 — P3 is a linear operator such 
that 


\[ T(x^2) = 2x - 1 \qquad T(-3x) = x^2 - 1 \]

\[ T(-x^2 + 3x) = 2x^2 - 2x + 1 \]


a. Is it possible to determine T (2x? — 3x + 2)? If 
so, find it; and if not, explain why. 

b. Is it possible to determine T x? — 4x)? If so, 
find it; and if not, explain why. 


Suppose that T: R? —> R? is a linear operator 
such that 
1 —1 
T 0 = 
0 3 
2 
T 1 =| -2 
1 


29. 


30. 


31. 


32. 


33. 




1 8 
T 3 = | —10 
0 =3 
a. Find 
2 
T —5 
0 


b. Is it possible to determine T (v) for all vectors 
v in RÌ? Explain. 


Define a linear operator T: R? — R? by 


“(Daley 
y a 
a. Find a matrix A such that T (v) = Av. 
b. Find T (e1) and T (e2). 


Define a linear transformation T: R? > R? by 
x— 2y 
piè 
2y 
a. Find a matrix A such that T (v) = Av. 
b. Find T (e) and T (e2). 
Define T: R? > R? by 
x 
T y = | 


z] 
z 


x—y 
Find all vectors that are mapped to 0. 
Define T: R? —> R? by 
x 
T y = | 


=] 
z 


—x+5y+z 


Find all vectors that are mapped to 0. 


Define T: R? > R? by 


x x— y+2z 
T y = 2x +3y— z 
z = x+2y—2z 


a. Find all vectors in R? that are mapped to the 
zero vector. 


b. Let $\mathbf{w} = \begin{bmatrix} 7 \\ -6 \\ -9 \end{bmatrix}$. Determine whether there is
a vector v in $\mathbb{R}^3$ such that $T(\mathbf{v}) = \mathbf{w}$.

34. Define $T: P_2 \to P_2$ by

\[ T(p(x)) = p'(x) - p(0) \]

a. Find all vectors that are mapped to 0.

b. Find two polynomials $p(x)$ and $q(x)$ such that
$T(p(x)) = T(q(x)) = 6x - 3$.

c. Is T a linear operator?

35. Suppose $T_1: V \to \mathbb{R}$ and $T_2: V \to \mathbb{R}$ are linear
transformations. Define $T: V \to \mathbb{R}^2$ by

\[ T(\mathbf{v}) = \begin{bmatrix} T_1(\mathbf{v}) \\ T_2(\mathbf{v}) \end{bmatrix} \]

Show that T is a linear transformation.

36. Define $T: M_{n \times n} \to \mathbb{R}$ by $T(A) = \operatorname{tr}(A)$. Show
that T is a linear transformation.

37. Suppose that B is a fixed n × n matrix. Define
$T: M_{n \times n} \to M_{n \times n}$ by $T(A) = AB - BA$. Show
that T is a linear operator.

38. Define $T: \mathbb{R} \to \mathbb{R}$ by $T(x) = mx + b$. Determine
when T is a linear operator.

39. Define $T: C^{(0)}[0, 1] \to \mathbb{R}$ by

\[ T(f) = \int_0^1 f(x)\,dx \]

for each function f in $C^{(0)}[0, 1]$.

a. Show that T is a linear operator.
b. Find $T(2x^2 - x + 3)$.

40. Suppose that $T: V \to W$ is a linear transformation
and $T(\mathbf{u}) = \mathbf{w}$. If $T(\mathbf{v}) = \mathbf{0}$, then find $T(\mathbf{u} + \mathbf{v})$.

41. Suppose that $T: \mathbb{R}^n \to \mathbb{R}^m$ is a linear
transformation and $\{\mathbf{v}, \mathbf{w}\}$ is a linearly
independent subset of $\mathbb{R}^n$. If $\{T(\mathbf{v}), T(\mathbf{w})\}$ is
linearly dependent, show that $T(\mathbf{u}) = \mathbf{0}$ has a
nontrivial solution.

42. Suppose that $T: V \to V$ is a linear operator and
$\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is linearly dependent. Show that
$\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is linearly dependent.

43. Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ be a linearly independent
subset of $\mathbb{R}^3$. Find a linear operator $T: \mathbb{R}^3 \to \mathbb{R}^3$,
such that $\{T(\mathbf{v}_1), T(\mathbf{v}_2), T(\mathbf{v}_3)\}$ is linearly
dependent.

44. Suppose that $T_1: V \to V$ and $T_2: V \to V$ are
linear operators and $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis for V.
If $T_1(\mathbf{v}_i) = T_2(\mathbf{v}_i)$, for each i = 1, 2, ..., n, show
that $T_1(\mathbf{v}) = T_2(\mathbf{v})$ for all v in V.

45. Verify that $\mathcal{L}(U, V)$ is a vector space.


4.2 The Null Space and Range


In Sec. 3.2, we defined the null space of an m × n matrix to be the subspace of $\mathbb{R}^n$ of
all vectors x with $A\mathbf{x} = \mathbf{0}$. We also defined the column space of A as the subspace of
$\mathbb{R}^m$ of all linear combinations of the column vectors of A. In this section we extend
these ideas to linear transformations.


DEFINITION 1 Null Space and Range Let V and W be vector spaces. For a linear
transformation $T: V \to W$ the null space of T, denoted by N(T), is defined by

\[ N(T) = \{\mathbf{v} \in V \mid T(\mathbf{v}) = \mathbf{0}\} \]

The range of T, denoted by R(T), is defined by

\[ R(T) = \{T(\mathbf{v}) \mid \mathbf{v} \in V\} \]


The null space of a linear transformation is then the set of all vectors in V that are 
mapped to the zero vector, with the range being the set of all images of the mapping, 
as shown in Fig. 1. 


In Theorem 3 we see that the null space and the range of a linear transformation 
are both subspaces. 


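THEOREM 3 Let V and W be vector spaces and $T: V \to W$ a linear transformation.

1. The null space of T is a subspace of V.
2. The range of T is a subspace of W.

Proof (1) Let $\mathbf{v}_1$ and $\mathbf{v}_2$ be in N(T), so that $T(\mathbf{v}_1) = \mathbf{0}$ and $T(\mathbf{v}_2) = \mathbf{0}$. If c is a
scalar, then using the linearity of T, we have

\[ T(c\mathbf{v}_1 + \mathbf{v}_2) = cT(\mathbf{v}_1) + T(\mathbf{v}_2) = c\mathbf{0} + \mathbf{0} = \mathbf{0} \]

Thus, $c\mathbf{v}_1 + \mathbf{v}_2$ is in N(T), and by Theorem 4 of Sec. 3.2, N(T) is a subspace
of V.

(2) Let $\mathbf{w}_1$ and $\mathbf{w}_2$ be in R(T). Then there are vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ in V such that
$T(\mathbf{v}_1) = \mathbf{w}_1$ and $T(\mathbf{v}_2) = \mathbf{w}_2$. Then for any scalar c,

\[ T(c\mathbf{v}_1 + \mathbf{v}_2) = cT(\mathbf{v}_1) + T(\mathbf{v}_2) = c\mathbf{w}_1 + \mathbf{w}_2 \]

so that $c\mathbf{w}_1 + \mathbf{w}_2$ is in R(T) and hence R(T) is a subspace of W.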


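EXAMPLE 1 Define the linear transformation $T: \mathbb{R}^4 \to \mathbb{R}^3$ by

\[ T\left(\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix}\right) = \begin{bmatrix} a + b \\ b - c \\ a + d \end{bmatrix} \]

a. Find a basis for the null space of T and its dimension.
b. Give a description of the range of T.
c. Find a basis for the range of T and its dimension.

Solution a. The null space of T is found by setting each component of the image vector
equal to 0. This yields the linear system

\[ \begin{aligned} a + b &= 0 \\ b - c &= 0 \\ a + d &= 0 \end{aligned} \]

This linear system has infinitely many solutions, given by

\[ S = \left\{ \begin{bmatrix} -t \\ t \\ t \\ t \end{bmatrix} \;\Bigg|\; t \in \mathbb{R} \right\} \]

Hence,

\[ N(T) = \operatorname{span}\left\{ \begin{bmatrix} -1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right\} \]

A basis for N(T) consists of the one vector

\[ \begin{bmatrix} -1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \]

Consequently, dim(N(T)) = 1.

b. Observe that any vector in the range can be written as

\[ a\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + b\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + c\begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix} + d\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

for some real numbers a, b, c, and d. Therefore,

\[ R(T) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\} \]

c. Since the range is a subspace of $\mathbb{R}^3$, it has dimension less than or equal to 3.
Consequently, the four vectors found to span the range in part (b) are linearly
dependent and do not form a basis. To find a basis for R(T), we use the
trimming procedure given in Sec. 3.3 and reduce the matrix

\[
\begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 1 & 0 & 0 & 1 \end{bmatrix}
\quad\text{to}\quad
\begin{bmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & -1 \end{bmatrix}
\]

Since the reduced matrix has pivots in the first three columns, a basis for the
range of T is

\[ B = \left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix} \right\} \]

Therefore, dim(R(T)) = 3. Observe that B also spans $\mathbb{R}^3$, so that $R(T) = \mathbb{R}^3$.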
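The row reduction used in this example can be reproduced with exact arithmetic. The sketch below is one possible implementation, assuming the transformation sends (a, b, c, d) to (a + b, b − c, a + d) as in the linear system of part (a). The function `rref` returns the reduced matrix together with its pivot columns, and `null_space_basis` reads one basis vector off each free column. Both names are illustrative, not from the text.

```python
from fractions import Fraction

def rref(matrix):
    """Return the reduced row echelon form and the list of pivot column indices."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column: free variable
        rows[r], rows[pivot] = rows[pivot], rows[r]
        lead = rows[r][c]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots

def null_space_basis(matrix):
    """One basis vector of the null space per free column of the reduced matrix."""
    reduced, pivots = rref(matrix)
    n = len(matrix[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        vec = [Fraction(0)] * n
        vec[free] = Fraction(1)
        for row, p in zip(reduced, pivots):
            vec[p] = -row[free]
        basis.append(vec)
    return basis

A = [[1, 1, 0, 0],
     [0, 1, -1, 0],
     [1, 0, 0, 1]]        # columns are the images of the standard basis vectors

reduced, pivots = rref(A)
print(pivots)                    # pivot columns give a basis for the range
print(null_space_basis(A))       # basis for the null space
```

The number of pivot columns plus the number of null space basis vectors equals 4, the dimension of the domain.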


EXAMPLE 2 Define the linear transformation $T: P_4 \to P_3$ by

\[ T(p(x)) = p'(x) \]

Find the null space and range of T.

Solution Recall that the derivative of a constant polynomial is 0. Since these are the only
polynomials for which the derivative is 0, we know that N(T) is the set of constant
polynomials in $P_4$. We claim that the range of T is all of $P_3$. To see this, let
$q(x) = ax^3 + bx^2 + cx + d$ be an arbitrary element of $P_3$. A polynomial $p(x)$
whose derivative is $q(x)$ is found by using the antiderivative. That is, to find $p(x)$,
we integrate $q(x)$ to obtain

\[ p(x) = \int q(x)\,dx = \int \left(ax^3 + bx^2 + cx + d\right)dx = \frac{a}{4}x^4 + \frac{b}{3}x^3 + \frac{c}{2}x^2 + dx + e \]

which is an element of $P_4$, with $p'(x) = q(x)$. This shows that for every polynomial
$q(x)$ in $P_3$ there is a polynomial $p(x)$ in $P_4$ such that $T(p(x)) = q(x)$, giving that
the range of T is all of $P_3$.


In Sec. 4.1, we saw that the image of an arbitrary vector v € V can be computed if 
the image T (v;) is known for each vector v; in a basis for V. This leads to Theorem 4. 


Let V and W be finite dimensional vector spaces and B = {v1, V2,..., Vn} a basis 
for V. If T: V — W is a linear transformation, then 


R(T) = span{T (v1), T (v2), ..., T (Vn)} 


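Proof To show that the two sets are equal, we will show that each is a subset of
the other. First, if $\mathbf{w}$ is in $R(T)$, then there is a vector $\mathbf{v}$ in $V$ such that $T(\mathbf{v}) = \mathbf{w}$.
Now, since $B$ is a basis for $V$, there are scalars $c_1, \ldots, c_n$ with

$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$$

so that

$$T(\mathbf{v}) = T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n)$$

From the linearity of $T$, we have

$$\mathbf{w} = T(\mathbf{v}) = c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + \cdots + c_nT(\mathbf{v}_n)$$

As $\mathbf{w}$ is a linear combination of $T(\mathbf{v}_1), T(\mathbf{v}_2), \ldots, T(\mathbf{v}_n)$, then $\mathbf{w} \in \operatorname{span}\{T(\mathbf{v}_1),
T(\mathbf{v}_2), \ldots, T(\mathbf{v}_n)\}$. Since this is true for all $\mathbf{w}$ in $R(T)$, then

$$R(T) \subseteq \operatorname{span}\{T(\mathbf{v}_1), T(\mathbf{v}_2), \ldots, T(\mathbf{v}_n)\}$$

On the other hand, suppose that $\mathbf{w} \in \operatorname{span}\{T(\mathbf{v}_1), T(\mathbf{v}_2), \ldots, T(\mathbf{v}_n)\}$. Then
there are scalars $c_1, \ldots, c_n$ with

$$\begin{aligned} \mathbf{w} &= c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + \cdots + c_nT(\mathbf{v}_n) \\ &= T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n) \end{aligned}$$

Therefore, $\mathbf{w}$ is the image under $T$ of $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$, which is an element
of $V$. Therefore, $\operatorname{span}\{T(\mathbf{v}_1), T(\mathbf{v}_2), \ldots, T(\mathbf{v}_n)\} \subseteq R(T)$.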
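Theorem 4 can be illustrated with the derivative map of Example 2. The sketch below is one possible encoding, assuming a polynomial is stored as a list of coefficients with index $k$ holding the coefficient of $x^k$; the function name `derivative` and the sample polynomial `q` are illustrative only. It writes a polynomial in $P_3$ as a combination of the images of the standard basis of $P_4$.

```python
from fractions import Fraction

def derivative(p):
    """Coefficients of p'(x), where p[k] is the coefficient of x**k."""
    return [k * p[k] for k in range(1, len(p))]

# Standard basis 1, x, x^2, x^3, x^4 of P_4 as coefficient lists.
basis_P4 = [[1 if i == k else 0 for i in range(5)] for k in range(5)]

# Images T(1), T(x), ..., T(x^4); by Theorem 4 these span R(T).
images = [derivative(b) for b in basis_P4]

# An arbitrary q(x) = 7 - 2x + 6x^2 + 4x^3 in P_3 (sample values only).
q = [7, -2, 6, 4]

# Since T(x^(k+1)) = (k+1) x^k, the scalar on T(x^(k+1)) is q[k] / (k+1).
c = [Fraction(q[k], k + 1) for k in range(4)]

# Rebuild q from the images of x, x^2, x^3, x^4.
combo = [sum(c[k] * images[k + 1][i] for k in range(4)) for i in range(4)]
print(combo == q)
```

The image of the constant polynomial is zero, so only the images of $x, x^2, x^3, x^4$ are needed, which matches the antiderivative argument in Example 2.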


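EXAMPLE 3 Let $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ be a linear operator and $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ a basis for $\mathbb{R}^3$. Suppose
that

$$T(\mathbf{v}_1) = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} \qquad T(\mathbf{v}_2) = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \qquad T(\mathbf{v}_3) = \begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix}$$

a. Is $\begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$ in $R(T)$?

b. Find a basis for $R(T)$.

c. Find the null space $N(T)$.

Solution a. From Theorem 4, the vector $\mathbf{w} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$ is in $R(T)$ if there are scalars $c_1$, $c_2$,
and $c_3$ such that

$$c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + c_3T(\mathbf{v}_3) = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$

that is,

$$c_1\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + c_2\begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} + c_3\begin{bmatrix} 2 \\ 1 \\ -1 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$$

The set of solutions to this linear system is given by $S = \{(2 - t, -1 - t, t) \mid
t \in \mathbb{R}\}$. In particular, if $t = 0$, then a solution is $c_1 = 2$, $c_2 = -1$, and $c_3 = 0$.
Thus, $\mathbf{w} \in R(T)$.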


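b. To find a basis for $R(T)$, we row-reduce the matrix

$$\begin{bmatrix} 1 & 1 & 2 \\ 1 & 0 & 1 \\ 0 & -1 & -1 \end{bmatrix} \quad\text{to obtain}\quad \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

Since the leading 1s are in columns 1 and 2, a basis for $R(T)$ consists of the first two columns, so that

$$R(T) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix} \right\}$$

Observe that since the range is spanned by two linearly independent vectors,
$R(T)$ is a plane in $\mathbb{R}^3$, as shown in Fig. 2.

c. Since $B$ is a basis for $\mathbb{R}^3$, the null space is the set of all vectors $c_1\mathbf{v}_1 + c_2\mathbf{v}_2 +
c_3\mathbf{v}_3$ such that

$$c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + c_3T(\mathbf{v}_3) = \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}$$

By using the reduced matrix

$$\begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$$

from part (b), the null space consists of all vectors such that $c_1 = -c_3$, $c_2 =
-c_3$, and $c_3$ is any real number. That is,

$$N(T) = \operatorname{span}\{-\mathbf{v}_1 - \mathbf{v}_2 + \mathbf{v}_3\}$$

which is a line in $\mathbb{R}^3$. See Fig. 2.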


Figure 2 
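The coefficient computations in Example 3 can be verified directly. This is a small sketch, assuming the three image vectors given in the example; the helper name `combine` is illustrative. It forms $c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + c_3T(\mathbf{v}_3)$ for a given triple of scalars.

```python
# Images of the basis vectors v1, v2, v3 from Example 3.
images = [(1, 1, 0), (1, 0, -1), (2, 1, -1)]

def combine(c):
    """Return c1*T(v1) + c2*T(v2) + c3*T(v3) as a tuple."""
    return tuple(sum(ci * img[k] for ci, img in zip(c, images))
                 for k in range(3))

# Part (a): every (2 - t, -1 - t, t) should produce w = (1, 2, 1).
w = (1, 2, 1)
part_a = all(combine((2 - t, -1 - t, t)) == w for t in range(-3, 4))

# Part (c): the coefficients (-1, -1, 1) should produce the zero vector,
# so -v1 - v2 + v3 lies in N(T).
part_c = combine((-1, -1, 1))

print(part_a, part_c)
```

Only a few integer values of $t$ are sampled here; the algebra in the example covers all real $t$.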


Notice that in Example 3 we have

$$\dim(\mathbb{R}^3) = \dim(R(T)) + \dim(N(T))$$


In Theorem 5 we establish this fundamental result for all linear transformations 
between finite dimensional vector spaces. 


THEOREM 5

Let $V$ and $W$ be finite dimensional vector spaces. If $T\colon V \to W$ is a linear
transformation, then

$$\dim(V) = \dim(R(T)) + \dim(N(T))$$


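Proof Suppose that $\dim(V) = n$. To establish the result, we consider three cases.

First, suppose that $\dim(N(T)) = \dim(V) = n$. In this case, the image of every
vector in $V$ is the zero vector (in $W$), so that $R(T) = \{\mathbf{0}\}$. Since the dimension of
the vector space containing only the zero element is 0, the result holds.

Now suppose $1 \le r = \dim(N(T)) < n$. Let $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ be a basis for $N(T)$.
By Corollary 1 of Sec. 3.3, there are $n - r$ vectors $\{\mathbf{v}_{r+1}, \mathbf{v}_{r+2}, \ldots, \mathbf{v}_n\}$ such
that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ is a basis for $V$. We claim that $S = \{T(\mathbf{v}_{r+1}),
T(\mathbf{v}_{r+2}), \ldots, T(\mathbf{v}_n)\}$ is a basis for $R(T)$. By Theorem 4, we have

$$R(T) = \operatorname{span}\{T(\mathbf{v}_1), T(\mathbf{v}_2), \ldots, T(\mathbf{v}_r), T(\mathbf{v}_{r+1}), \ldots, T(\mathbf{v}_n)\}$$

Since $T(\mathbf{v}_1) = T(\mathbf{v}_2) = \cdots = T(\mathbf{v}_r) = \mathbf{0}$, each vector in $R(T)$ is a linear combination
of $T(\mathbf{v}_{r+1}), \ldots, T(\mathbf{v}_n)$ and hence $R(T) = \operatorname{span}(S)$. To show that $S$ is linearly
independent, we consider the equation

$$c_{r+1}T(\mathbf{v}_{r+1}) + c_{r+2}T(\mathbf{v}_{r+2}) + \cdots + c_nT(\mathbf{v}_n) = \mathbf{0}$$

We need to show that $c_{r+1} = c_{r+2} = \cdots = c_n = 0$. Since $T$ is linear, the previous
equation can be written as

$$T(c_{r+1}\mathbf{v}_{r+1} + c_{r+2}\mathbf{v}_{r+2} + \cdots + c_n\mathbf{v}_n) = \mathbf{0}$$

From this last equation, we have that $c_{r+1}\mathbf{v}_{r+1} + c_{r+2}\mathbf{v}_{r+2} + \cdots + c_n\mathbf{v}_n$ is in $N(T)$.
However, since $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r\}$ is a basis for $N(T)$, there are scalars $c_1, c_2, \ldots, c_r$
such that

$$c_{r+1}\mathbf{v}_{r+1} + c_{r+2}\mathbf{v}_{r+2} + \cdots + c_n\mathbf{v}_n = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_r\mathbf{v}_r$$

that is,

$$-c_1\mathbf{v}_1 - c_2\mathbf{v}_2 - \cdots - c_r\mathbf{v}_r + c_{r+1}\mathbf{v}_{r+1} + c_{r+2}\mathbf{v}_{r+2} + \cdots + c_n\mathbf{v}_n = \mathbf{0}$$

Now, since $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_r, \mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ is a basis for $V$ and hence linearly
independent, the coefficients of the last equation must all be 0, that is, $c_1 = c_2 = \cdots =
c_r = c_{r+1} = \cdots = c_n = 0$. In particular, $c_{r+1} = c_{r+2} = \cdots = c_n = 0$. Hence, the
$n - r$ vectors $T(\mathbf{v}_{r+1}), \ldots, T(\mathbf{v}_n)$ are a basis for $R(T)$. Consequently,

$$n = \dim(V) = (n - r) + r = \dim(R(T)) + \dim(N(T))$$

Finally, suppose that $N(T) = \{\mathbf{0}\}$, so that $\dim(N(T)) = 0$. If $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a
basis for $V$, then by Theorem 4 we have

$$R(T) = \operatorname{span}\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$$

A similar argument to the one above shows that $\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is linearly
independent. Thus, $\dim(R(T)) = n = \dim(V)$, and the result also holds in this
case.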


EXAMPLE 4 Define a linear transformation $T\colon P_4 \to P_2$ by

$$T(p(x)) = p''(x)$$

Find the dimension of the range of $T$, and give a description of the range.

Solution Let $B = \{1, x, x^2, x^3, x^4\}$ be the standard basis for $P_4$. Since $p(x)$ is in $N(T)$ if and
only if $p''(x) = 0$, the null space is the subspace of $P_4$ consisting of polynomials
with degree 1 or less. Hence, $\{1, x\}$ is a basis for $N(T)$, and $\dim(N(T)) = 2$.
Since $\dim(P_4) = 5$, by Theorem 5 we have

$$2 + \dim(R(T)) = 5 \qquad\text{so}\qquad \dim(R(T)) = 3$$

Then as in the proof of Theorem 5, we have that

$$\{T(x^2), T(x^3), T(x^4)\} = \{2, 6x, 12x^2\}$$

is a basis for $R(T)$. Observe that $R(T)$ is just the subspace $P_2$ of $P_4$.
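The dimension count in this example can be checked by applying the second derivative to each standard basis polynomial. The sketch below assumes coefficient lists indexed by power of $x$; the name `second_derivative` is illustrative. The nonzero images here are multiples of distinct monomials, which is why simply counting them gives the dimension of the range in this particular case.

```python
def second_derivative(p):
    """Coefficients of p''(x), where p[k] is the coefficient of x**k."""
    return [k * (k - 1) * p[k] for k in range(2, len(p))]

# Standard basis 1, x, x^2, x^3, x^4 of P_4 as coefficient lists.
basis = [[1 if i == k else 0 for i in range(5)] for k in range(5)]
images = [second_derivative(b) for b in basis]

# Powers whose image is zero: these basis polynomials lie in N(T).
null_part = [k for k, img in enumerate(images) if not any(img)]

# Nonzero images: 2, 6x, 12x^2 as coefficient lists in P_2.
range_part = [img for img in images if any(img)]

print(len(null_part), len(range_part))
```

The two counts add up to $\dim(P_4) = 5$, as Theorem 5 requires.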


Matrices 


In Sec. 3.2 we defined the column space of a matrix $A$, denoted by $\operatorname{col}(A)$, as the
span of its column vectors. We also defined the null space of the $m \times n$ matrix $A$ as
the set of all vectors $\mathbf{x}$ in $\mathbb{R}^n$ such that $A\mathbf{x} = \mathbf{0}$. We further examine these notions here
in the context of linear transformations. In particular, let $A$ be an $m \times n$ matrix and
let $T\colon \mathbb{R}^n \to \mathbb{R}^m$ be the linear transformation defined by

$$T(\mathbf{v}) = A\mathbf{v}$$

This last equation can be written in vector form as

$$T(\mathbf{v}) = v_1\mathbf{A}_1 + v_2\mathbf{A}_2 + \cdots + v_n\mathbf{A}_n$$

where $\mathbf{A}_i$ are the column vectors of $A$, and $v_i$ are the components of $\mathbf{v}$ for $1 \le i \le n$.
In this way we see that the range of $T$, which is a subspace of $\mathbb{R}^m$, is equal to the
column space of $A$, that is,

$$R(T) = \operatorname{col}(A)$$

The dimension of the column space of $A$ is called the column rank of $A$. We also have

$$N(T) = \{\mathbf{v} \in \mathbb{R}^n \mid A\mathbf{v} = \mathbf{0}\} = N(A)$$

The dimension of $N(A)$ is called the nullity of $A$. Applying Theorem 5, we have

$$\text{column rank}(A) + \operatorname{nullity}(A) = n$$

Another subspace of $\mathbb{R}^n$ associated with the matrix $A$ is the row space of $A$,
denoted by $\operatorname{row}(A)$, which is the span of the row vectors of $A$. Since the transpose
operation maps the row vectors of $A$ to the column vectors of $A^t$, the row space of
$A$ is the same as the column space of $A^t$, that is,

$$\operatorname{row}(A) = \operatorname{col}(A^t)$$


By using the algorithm for finding a basis, given in Sec. 3.3, a basis for $\operatorname{col}(A)$ can
be found by row reduction. In particular, the columns with a leading 1 in the
row-reduced form of $A$ correspond to the column vectors of $A$ needed for a basis of $\operatorname{col}(A)$.
Hence, the column rank of $A$ is equal to the number of columns containing a leading 1
in the row-reduced form of $A$. On the other hand, row-reducing $A$ eliminates row
vectors that are linear combinations of the others, so that the nonzero row vectors of
the reduced form of $A$ form a basis for $\operatorname{row}(A)$. Hence, the row rank is equal to the
number of columns containing a leading 1 in the reduced form of $A$. We have now
established Theorem 6.

THEOREM 6

The row rank and the column rank of a matrix $A$ are equal.

We can now define the rank of a matrix $A$ as $\dim(\operatorname{row}(A))$ or $\dim(\operatorname{col}(A))$.
Again by Theorem 5, we have

$$\operatorname{rank}(A) + \operatorname{nullity}(A) = n$$
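As a quick numerical check of Theorem 6 and the rank and nullity formula, the following sketch computes the rank of a matrix and of its transpose by Gaussian elimination with exact fractions. The function name `rank` is an illustrative helper, and the matrix is the one row-reduced in Example 3.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots found by forward elimination (exact arithmetic)."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue                      # no pivot in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):    # clear entries below the pivot
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

A = [[1, 1, 2],
     [1, 0, 1],
     [0, -1, -1]]
At = [list(col) for col in zip(*A)]       # transpose of A

column_rank = rank(A)                     # dim(col(A))
row_rank = rank(At)                       # dim(col(A^t)) = dim(row(A))
nullity = len(A[0]) - column_rank         # from rank(A) + nullity(A) = n
print(column_rank, row_rank, nullity)
```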


Linear Systems 


When the nullity of a matrix $A$ is known, the above formula can sometimes be used to
determine whether the linear system $A\mathbf{x} = \mathbf{b}$ is consistent. For example, suppose that
a linear system consists of 20 equations each with 22 variables. Further suppose that a
basis for the null space of the $20 \times 22$ coefficient matrix consists of two vectors. That
is, every solution to the homogeneous linear system $A\mathbf{x} = \mathbf{0}$ is a linear combination
of two linearly independent vectors in $\mathbb{R}^{22}$. Then $\operatorname{nullity}(A) = 2$, so that

$$\dim(\operatorname{col}(A)) = \operatorname{rank}(A) = 22 - \operatorname{nullity}(A) = 20$$

But the only subspace of $\mathbb{R}^{20}$ with dimension 20 is $\mathbb{R}^{20}$ itself. Hence, $\operatorname{col}(A) = \mathbb{R}^{20}$,
and consequently every vector $\mathbf{b}$ in $\mathbb{R}^{20}$ is a linear combination of the columns of $A$.
That is, the linear system $A\mathbf{x} = \mathbf{b}$ is consistent for every vector $\mathbf{b}$ in $\mathbb{R}^{20}$. In general,
if $A$ is an $m \times n$ matrix, $\operatorname{nullity}(A) = r$, and $\dim(\operatorname{col}(A)) = n - r = m$, then the linear
system $A\mathbf{x} = \mathbf{b}$ is consistent for every vector $\mathbf{b}$ in $\mathbb{R}^m$.
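The counting argument above reduces to one comparison, which can be sketched as follows. The function name `consistent_for_every_b` is hypothetical and is introduced only for this illustration.

```python
def consistent_for_every_b(m, n, nullity):
    """True when rank(A) = n - nullity equals m, so that col(A) = R^m."""
    rank = n - nullity
    if not 0 <= rank <= min(m, n):
        raise ValueError("no m x n matrix has this nullity")
    return rank == m

# The 20 x 22 system from the text, with a two-dimensional null space.
print(consistent_for_every_b(20, 22, 2))
```

When the function returns False, the column space is a proper subspace of $\mathbb{R}^m$, so the system is inconsistent for at least one vector $\mathbf{b}$, although it may still be consistent for particular right-hand sides.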

We now add several more items to the list of equivalences given in Theorem 9 
of Sec. 2.3, connecting solutions of the linear system Ax = b and properties of the 
coefficient matrix A. 


THEOREM 7

Let $A$ be an $n \times n$ matrix. Then the following statements are equivalent.

1. The matrix $A$ is invertible.
2. The linear system $A\mathbf{x} = \mathbf{b}$ has a unique solution for every vector $\mathbf{b}$.
3. The homogeneous linear system $A\mathbf{x} = \mathbf{0}$ has only the trivial solution.
4. The matrix $A$ is row equivalent to the identity matrix.
5. The determinant of the matrix $A$ is nonzero.
6. The column vectors of $A$ are linearly independent.
7. The column vectors of $A$ span $\mathbb{R}^n$.
8. The column vectors of $A$ are a basis for $\mathbb{R}^n$.
9. $\operatorname{rank}(A) = n$
10. $R(A) = \operatorname{col}(A) = \mathbb{R}^n$
11. $N(A) = \{\mathbf{0}\}$
12. $\operatorname{row}(A) = \mathbb{R}^n$
13. The number of pivot columns of the reduced row echelon form of $A$ is $n$.
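Two of the equivalent conditions, the determinant being nonzero and the rank being $n$, can be compared on small examples. The sketch below computes both quantities from a single forward elimination; the function name `rank_and_det` and the matrix `B` are illustrative choices, while `A` is the singular matrix from Example 3.

```python
from fractions import Fraction

def rank_and_det(rows):
    """Return (rank, determinant) of a square matrix by elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    n = len(m)
    r, det = 0, Fraction(1)
    for c in range(n):
        p = next((i for i in range(r, n) if m[i][c] != 0), None)
        if p is None:
            continue                      # no pivot: the matrix is singular
        if p != r:
            m[r], m[p] = m[p], m[r]
            det = -det                    # a row swap flips the sign
        det *= m[r][c]                    # determinant is the product of pivots
        for i in range(r + 1, n):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
    return r, (det if r == n else Fraction(0))

A = [[1, 1, 2], [1, 0, 1], [0, -1, -1]]   # rank 2, not invertible
B = [[2, 0, 1], [1, 1, 0], [0, 1, 1]]     # sample invertible matrix

for M in (A, B):
    rank, det = rank_and_det(M)
    print(rank == len(M), det != 0)       # the two tests always agree
```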


Fact Summary 


Let V and W be vector spaces and T a linear transformation from V into W. 


1. The null space $N(T)$ is a subspace of $V$, and the range $R(T)$ is a
subspace of $W$.
2. If $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis for $V$, then

$$R(T) = \operatorname{span}\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$$

3. If $V$ and $W$ are finite dimensional vector spaces, then

$$\dim(V) = \dim(R(T)) + \dim(N(T))$$

4. If $A$ is an $m \times n$ matrix, then

$$\operatorname{rank}(A) + \operatorname{nullity}(A) = n$$

5. If $A$ is an $m \times n$ matrix, then the rank of $A$ is the number of columns
containing a leading 1 in the row-reduced form of $A$.
6. If $A$ is an $n \times n$ invertible matrix, in addition to Theorem 9 of Sec. 2.3, we
know that $\operatorname{rank}(A) = n$, $R(A) = \operatorname{col}(A) = \mathbb{R}^n$, $N(A) = \{\mathbf{0}\}$, and the number
of columns containing a leading 1 in the row echelon form of $A$ is $n$.


Exercise Set 4.2 


In Exercises 1—4, define a linear operator T: R? + R? Determine whether the vector v is in N (T). 


nDl ee] wr 




Bleue We FN 


In Exercises 5–8, define a linear operator
$T\colon P_3 \to P_3$ by

$$T(p(x)) = x\,p''(x)$$


Determine whether the polynomial p(x) is in N(T). 


5. p(x) =x*-—3x4+1 
6. p(x) =5x +2 

7. p(x) =1—<x? 

8. p(x) =3 


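In Exercises 9–12, define a linear operator
$T\colon \mathbb{R}^3 \to \mathbb{R}^3$ by

$$T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2z \\ 2x + y + 3z \\ x - y + 3z \end{bmatrix}$$

Determine whether the vector $\mathbf{v}$ is in $R(T)$.

9. $\mathbf{v} = \begin{bmatrix} 1 \\ 3 \\ 0 \end{bmatrix}$

10. $\mathbf{v} = \begin{bmatrix} 2 \\ 3 \\ 4 \end{bmatrix}$

11. $\mathbf{v} = \begin{bmatrix} -1 \\ 1 \\ -2 \end{bmatrix}$

12. $\mathbf{v} = \begin{bmatrix} -2 \\ -5 \\ -1 \end{bmatrix}$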

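In Exercises 13–16, define a linear transformation
$T\colon M_{2\times 2} \to M_{3\times 2}$ by

$$T\left(\begin{bmatrix} a & b \\ c & d \end{bmatrix}\right) = \begin{bmatrix} a + c & b + d \\ -a + 2c & -b + 2d \\ 2a & 2b \end{bmatrix}$$

Determine whether the matrix $A$ is in $R(T)$.

13. $A = \begin{bmatrix} -1 & -1 \\ -5 & -2 \\ 2 & 0 \end{bmatrix}$

14. $A = \begin{bmatrix} 1 & 2 \\ 3 & -3 \\ -2 & 2 \end{bmatrix}$

15. $A = \begin{bmatrix} 1 & 0 \\ 2 & 1 \\ 4 & 0 \end{bmatrix}$

16. $A = \begin{bmatrix} 4 & 1 \\ -1 & 5 \\ 6 & -2 \end{bmatrix}$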

In Exercises 17—24, find a basis for the null space of 
the linear transformation T. 


17. T: R > R’, 
a DAR 
y y 
18. T: R? > RÈ, 
"(US D-135] 
y x—y 
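19. $T\colon \mathbb{R}^3 \to \mathbb{R}^3$,

$$T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + 2z \\ 2x + y + 3z \\ x - y + 3z \end{bmatrix}$$

20. $T\colon \mathbb{R}^3 \to \mathbb{R}^3$,

$$T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} -2x + 2y + 2z \\ 3x + 5y + z \\ 2y + z \end{bmatrix}$$

21. $T\colon \mathbb{R}^3 \to \mathbb{R}^3$,

$$T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x - 2y - z \\ -x + 2y + z \\ 2x - 4y - 2z \end{bmatrix}$$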
22. T: Rf > R, 
x 
X+ry-Z4rw 
T y = | 2x+y+4z+w 
a 3x + y + 9z 


23. T: Pz = R, 
T(p(x)) = pO) 


24. T: Pa > Po, 
T (p(x)) = p" (x) 


In Exercises 25—30, find a basis for the range of the 
linear transformation T. 


25. T: R? > R?, 
1 1 
Tw =|0 1 -1 |v 
2 0 
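26. $T\colon \mathbb{R}^5 \to \mathbb{R}^3$,

$$T(\mathbf{v}) = \begin{bmatrix} 1 & -2 & -3 & 1 & 5 \\ 3 & -1 & 1 & 0 & 4 \\ 1 & 1 & 3 & 1 & 2 \end{bmatrix}\mathbf{v}$$

27. $T\colon \mathbb{R}^3 \to \mathbb{R}^3$,

$$T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x \\ y \\ 0 \end{bmatrix}$$

28. $T\colon \mathbb{R}^3 \to \mathbb{R}^3$,

$$T\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x - y + 3z \\ x + y + z \\ -x + 3y - 5z \end{bmatrix}$$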
29. T: P3 > P3, 


T (p(x)) = p"œ) + p(x) + p0) 


30. T: P2 > Po, 
T (ax? + bx +c) = (a +b)x? + cx + (a+b) 


31. Let T: R? — R? be a linear operator and 
B = {v1, V2, V3} a basis for R*. Suppose 


—2 0 
T (v1) = 1 T (v2) = 1 
1 —1 

—2 

T (v3) = 2 


32. 


33. 


34. 


35. 


36. 




a. Determine whether 


is in the range of T. 
b. Find a basis for R(T). 
c. Find dim(N(T)). 


Let T: R3 > R? be a linear operator and 
B = {v1, V2, V3} a basis for R*. Suppose 


=j 0 
Tiviy)=| 2 T(v.) =| 5 
1 0 

=] 

Taye: | 21 

2 


a. Determine whether 


= 
II 
NOR 


is in the range of T. 
b. Find a basis for R(T). 
c. Find dim(N(T)). 


Let T: P2 > P2 be defined by 
T (ax? + bx +c) = ax? + (a — 2b)x +b 


a. Determine whether p(x) = 2x? — 4x + 6 is in 
the range of T. 


b. Find a basis for R(T). 
Let T: P2 —> P2 be defined by 
T (ax? + bx +c) = cx? +bx — b 


a. Determine whether p(x) = x? — x — 2 is in the 
range of T. 


b. Find a basis for R(T). 


Find a linear transformation T: R? > R? such 
that R(T) = R?. 


Find a linear operator T: R? —> R? such that 
R(T) = N(T). 




37. Define a linear operator T: Pa > P, by Find a basis for the null space of T. 
TPG) = p'a) 42. Define T: Mpxn > M,xn by T(A) = At. Show 
a. Describe the range of T. that R(T) = Mnxn. 
b. Find dim(R(T)). 43. Define T: Mpxn > Mnxn by T(A) = A + A'. 
38. Define a linear operator T: Pa, > Py by b. Find N(T). 
dk 
T(p(x)) = Fak (P@) 44. Define T: Maxn > Mnxn by T(A) = A — A’. 
x 


a. Find R(T). 


where 1 < k <n. Show dim(N(T)) =k. ; 
b. Find N(T). 


39. Suppose T: Rt — R° is a linear transformation. 


a. If dim(N(T)) = 2, then find dim(R(T)). 45. Let A be a fixed n x n matrix, and define 
b. If dim(R(T)) = 3, then find dim(N(T)). T: Mnxn > Mnxn by T(B) = AB. When does 
R(T) = Manxa? 
40. Show that if T: V — V is a linear operator such 
that R(T) = N(T), then dim(V) is even. 46. Let A be a fixed n x n diagonal matrix, and 
41. Let define T: R” — R” by T(v) = Av. 
A= 1 0 a. Show dim(R(T)) is the number of nonzero 
~1|0 -1 entries on the diagonal of A. 
Define T: Mzx2 > M2x2 by b. Find dim(N(T)). How is it related to the 
T(B) = AB — BA diagonal terms of the matrix A? 


4.3 > Isomorphisms 


Many of the vector spaces that we have discussed are, from an algebraic perspective, 
the same. In this section we show how an isomorphism, which is a special kind of 
linear transformation, can be used to establish a correspondence between two vector 
spaces. Essential to this discussion are the concepts of one-to-one and onto mappings. 
For a more detailed description see App. A, Sec. A.2. 


DEFINITION 1 One-to-One and Onto Let $V$ and $W$ be vector spaces and $T\colon V \to W$ a
mapping.
1. The mapping $T$ is called one-to-one (or injective) if $\mathbf{u} \ne \mathbf{v}$ implies that $T(\mathbf{u}) \ne
T(\mathbf{v})$. That is, distinct elements of $V$ must have distinct images in $W$.
2. The mapping $T$ is called onto (or surjective) if $T(V) = W$. That is, the range
of $T$ is $W$.

A mapping is called bijective if it is both injective and surjective.

When we are trying to show that a mapping is one-to-one, a useful equivalent
formulation comes from the contrapositive statement. That is, $T$ is one-to-one if
$T(\mathbf{u}) = T(\mathbf{v})$ implies that $\mathbf{u} = \mathbf{v}$. To show that a mapping is onto, we must show that
if $\mathbf{w}$ is an arbitrary element of $W$, then there is some element $\mathbf{v} \in V$ with $T(\mathbf{v}) = \mathbf{w}$.


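EXAMPLE 1 Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the mapping defined by $T(\mathbf{v}) = A\mathbf{v}$, with

$$A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$$

Show that $T$ is one-to-one and onto.

Solution First, to show that $T$ is one-to-one, let

$$\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix} \qquad\text{and}\qquad \mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$$

Then

$$T(\mathbf{u}) = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}\begin{bmatrix} u_1 \\ u_2 \end{bmatrix} = \begin{bmatrix} u_1 + u_2 \\ -u_1 \end{bmatrix}$$

and

$$T(\mathbf{v}) = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} v_1 + v_2 \\ -v_1 \end{bmatrix}$$

Now suppose that $T(\mathbf{u}) = T(\mathbf{v})$. Equating the second components gives $u_1 = v_1$, and using this when equating the
first components gives $u_2 = v_2$. Thus, $\mathbf{u} = \mathbf{v}$, establishing that the mapping is
one-to-one.

Next, to show that $T$ is onto, let $\mathbf{w} = \begin{bmatrix} a \\ b \end{bmatrix}$ be an arbitrary vector in $\mathbb{R}^2$. We
must show that there is a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ in $\mathbb{R}^2$ such that

$$T(\mathbf{v}) = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix}$$

Applying the inverse of $A$ to both sides of this equation, we have

$$\begin{bmatrix} v_1 \\ v_2 \end{bmatrix} = \begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} -b \\ a + b \end{bmatrix}$$

Thus, $T$ is onto. For example, let $\mathbf{w} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$; then using the above formula for the
preimage, we have $\mathbf{v} = \begin{bmatrix} -2 \\ 3 \end{bmatrix}$. As verification, observe that

$$T(\mathbf{v}) = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}\begin{bmatrix} -2 \\ 3 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$$

An alternative argument is to observe that the column vectors of $A$ are linearly
independent and hence are a basis for $\mathbb{R}^2$. Therefore, the range of $T$, being the
column space of $A$, is all of $\mathbb{R}^2$.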
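The preimage computation in Example 1 can be checked with a few lines of arithmetic. The sketch below assumes the matrix of Example 1 has rows $(1, 1)$ and $(-1, 0)$, so that its inverse has rows $(0, -1)$ and $(1, 1)$; the helper name `apply` is illustrative.

```python
def apply(M, v):
    """Multiply a 2 x 2 matrix (given as a list of rows) by a vector in R^2."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

A = [[1, 1], [-1, 0]]        # assumed entries of the matrix in Example 1
A_inv = [[0, -1], [1, 1]]    # inverse of A, valid because det(A) = 1

w = (1, 2)
v = apply(A_inv, w)          # candidate preimage of w
print(v, apply(A, v))        # applying T to v should return w

# One-to-one and onto together mean the two maps undo each other.
samples = [(0, 0), (1, 0), (0, 1), (3, -4)]
round_trip = all(apply(A_inv, apply(A, u)) == u for u in samples)
print(round_trip)
```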


Theorem 8 gives a useful way to determine whether a linear transformation is
one-to-one.

THEOREM 8

The linear transformation $T\colon V \to W$ is one-to-one if and only if the null space
of $T$ consists of only the zero vector of $V$.


Proof First suppose that T is one-to-one. We claim that N(T) = {0}. To show 
this, let v be any vector in the null space of T, so that T(v) = 0. We also have, 
by Proposition 1 of Sec. 4.1, T (0) = 0. Since T is one-to-one, then v = 0, so only 
the zero vector is mapped to the zero vector. 

Now suppose that N(T) = {0} and 


T (u) = T (v) 


Subtracting T (v) from both sides of the last equation and using the linearity of T, 
we obtain 

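$$T(\mathbf{u}) - T(\mathbf{v}) = \mathbf{0} \qquad\text{so that}\qquad T(\mathbf{u} - \mathbf{v}) = \mathbf{0}$$

Thus, $\mathbf{u} - \mathbf{v} \in N(T)$. Since the null space consists of only the zero vector, $\mathbf{u} - \mathbf{v} =
\mathbf{0}$, that is, $\mathbf{u} = \mathbf{v}$.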


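EXAMPLE 2 Define a linear operator $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ by

$$T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 2x - 3y \\ 5x + 2y \end{bmatrix}$$

Use Theorem 8 to show that $T$ is one-to-one.

Solution The vector $\begin{bmatrix} x \\ y \end{bmatrix}$ is in the null space of $T$ if and only if

$$\begin{aligned} 2x - 3y &= 0 \\ 5x + 2y &= 0 \end{aligned}$$

This linear system has the unique solution $x = y = 0$. Thus, $N(T) = \{\mathbf{0}\}$ and hence
by Theorem 8, $T$ is one-to-one.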


The mapping of Example 2 can alternatively be defined by using the matrix

$$A = \begin{bmatrix} 2 & -3 \\ 5 & 2 \end{bmatrix}$$

so that $T(\mathbf{x}) = A\mathbf{x}$. Since $\det(A) \ne 0$, then $A$ is invertible. This allows us to show
that the map is also onto. Indeed, if $\mathbf{b}$ is any vector in $\mathbb{R}^2$, then $\mathbf{x} = A^{-1}\mathbf{b}$ is the vector
in the domain that is the preimage of $\mathbf{b}$, so that $T$ is onto.
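For the operator of Example 2, both observations can be checked numerically: the determinant is nonzero, and the preimage of any $\mathbf{b}$ is obtained from the inverse. The sketch below uses the standard formula for the inverse of a $2 \times 2$ matrix; the function names `preimage` and `T` are illustrative.

```python
from fractions import Fraction

A = [[2, -3], [5, 2]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]    # 2*2 - (-3)*5

def T(v):
    """The operator of Example 2."""
    return (2 * v[0] - 3 * v[1], 5 * v[0] + 2 * v[1])

def preimage(b):
    """Solve Ax = b with the 2 x 2 inverse formula (valid since det != 0)."""
    x = Fraction(A[1][1] * b[0] - A[0][1] * b[1], det)
    y = Fraction(-A[1][0] * b[0] + A[0][0] * b[1], det)
    return (x, y)

print(det)                    # nonzero, so A is invertible
print(preimage((0, 0)))       # only the zero vector maps to zero
print(T(preimage((1, 0))))    # recovers b = (1, 0)
```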

In Theorem 4 of Sec. 4.2, we showed that if $T\colon V \to W$ is a linear transformation
between vector spaces, and $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis for $V$, then $R(T) =
\operatorname{span}\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$. If, in addition, the transformation is one-to-one, then the
spanning vectors are also a basis for the range, as given in Theorem 9.


THEOREM 9

Suppose that $T\colon V \to W$ is a linear transformation and $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis
for $V$. If $T$ is one-to-one, then $\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is a basis for $R(T)$.

Proof By Theorem 4 of Sec. 4.2, we know that $\operatorname{span}\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\} =
R(T)$, so it suffices to show that $\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is linearly independent. To do
so, we consider the equation

$$c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + \cdots + c_nT(\mathbf{v}_n) = \mathbf{0}$$

which is equivalent to

$$T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n) = \mathbf{0}$$

Since $T$ is one-to-one, the null space consists of only the zero vector of $V$, so that

$$c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0}$$

Finally, since $B$ is a basis for $V$, it is linearly independent; hence

$$c_1 = c_2 = \cdots = c_n = 0$$

Therefore, $\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is linearly independent.


We note that in Theorem 9 if T is also onto, then {T (v1), ..., T (Vn)} is a basis 
for W. 
We are now ready to define an isomorphism on vector spaces. 


DEFINITION 2 Isomorphism Let $V$ and $W$ be vector spaces. A linear transformation $T\colon V \to
W$ that is both one-to-one and onto is called an isomorphism. In this case the vector
spaces $V$ and $W$ are said to be isomorphic.


Proposition 2 builds on the remarks following Example 2 and gives a useful 
characterization of linear transformations defined by a matrix that are isomorphisms. 


PROPOSITION 2

Let $A$ be an $n \times n$ matrix and $T\colon \mathbb{R}^n \to \mathbb{R}^n$ be the mapping defined by $T(\mathbf{x}) = A\mathbf{x}$.
Then $T$ is an isomorphism if and only if $A$ is invertible.

Proof Let $A$ be invertible and $\mathbf{b}$ be any vector in $\mathbb{R}^n$. Then $\mathbf{x} = A^{-1}\mathbf{b}$ is the
preimage of $\mathbf{b}$. Thus, the mapping $T$ is onto. To show that $T$ is one-to-one, observe
that by Theorem 10 of Sec. 1.5 the equation $A\mathbf{x} = \mathbf{0}$ has only the solution $\mathbf{x} =
A^{-1}\mathbf{0} = \mathbf{0}$. Thus, by Theorem 8, the mapping $T$ is one-to-one and hence is an
isomorphism from $\mathbb{R}^n$ onto $\mathbb{R}^n$.

Conversely, suppose that $T$ is an isomorphism. Then $T\colon \mathbb{R}^n \to \mathbb{R}^n$ is onto,
with the column space of $A$ being $\mathbb{R}^n$. Hence, by Theorem 7 of Sec. 4.2 the matrix
$A$ is invertible.


Theorem 10 is of fundamental importance to the study of finite dimensional vector
spaces and is the main result of the section.

THEOREM 10

If $V$ is a vector space with $\dim(V) = n$, then $V$ and $\mathbb{R}^n$ are isomorphic.


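Proof Let $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be an ordered basis for $V$. Let $T\colon V \to \mathbb{R}^n$ be the
coordinate transformation defined by $T(\mathbf{v}) = [\mathbf{v}]_B$, first introduced in Example 7 of
Sec. 4.1. We claim that $T$ is an isomorphism. First, to show that $T$ is one-to-one,
suppose that $T(\mathbf{v}) = \mathbf{0}$. Since $B$ is a basis, there are unique scalars $c_1, \ldots, c_n$ such
that

$$\mathbf{v} = c_1\mathbf{v}_1 + \cdots + c_n\mathbf{v}_n$$

Thus,

$$T(\mathbf{v}) = [\mathbf{v}]_B = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

so that $c_1 = c_2 = \cdots = c_n = 0$ and $\mathbf{v} = \mathbf{0}$. Therefore, $N(T) = \{\mathbf{0}\}$, and by Theorem
8, $T$ is one-to-one.

Now, to show that $T$ is onto, let

$$\mathbf{w} = \begin{bmatrix} k_1 \\ k_2 \\ \vdots \\ k_n \end{bmatrix}$$

be a vector in $\mathbb{R}^n$. Define $\mathbf{v}$ in $V$ by $\mathbf{v} = k_1\mathbf{v}_1 + \cdots + k_n\mathbf{v}_n$. Observe that $T(\mathbf{v}) = \mathbf{w}$
and hence $T$ is onto. Therefore, the linear transformation $T$ is an isomorphism, and
$V$ and $\mathbb{R}^n$ are isomorphic vector spaces.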


So far in our experience we have seen that $\dim(P_2) = 3$ and $\dim(S_{2\times 2}) = 3$,
where $S_{2\times 2}$ is the vector space of $2 \times 2$ symmetric matrices. Consequently, by Theorem
10, the vector spaces $P_2$ and $S_{2\times 2}$ are both isomorphic to $\mathbb{R}^3$, where the isomorphism
is the coordinate map between the standard bases. Next we show that in fact all vector
spaces of dimension $n$ are isomorphic to one another. To do so, we first require the
notion of the inverse of a linear transformation.

DEFINITION 3 Inverse of a Linear Transformation Let $V$ and $W$ be vector spaces and
$T\colon V \to W$ a one-to-one linear transformation. The mapping $T^{-1}\colon R(T) \to V$,
defined by

$$T^{-1}(\mathbf{w}) = \mathbf{v} \quad\text{if and only if}\quad T(\mathbf{v}) = \mathbf{w}$$

is called the inverse of $T$. If $T$ is onto, then $T^{-1}$ is defined on all of $W$.


By Theorem 4 of Sec. A.2, if $T$ is one-to-one, then the inverse map is well
defined. Indeed, let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $V$ such that $T^{-1}(\mathbf{w}) = \mathbf{u}$ and $T^{-1}(\mathbf{w}) = \mathbf{v}$.
Applying $T$ gives $T(T^{-1}(\mathbf{w})) = T(\mathbf{u})$ and $T(T^{-1}(\mathbf{w})) = T(\mathbf{v})$, so that $T(\mathbf{u}) = T(\mathbf{v})$.
Since $T$ is one-to-one, we have $\mathbf{u} = \mathbf{v}$.

The inverse map of a one-to-one linear transformation is also a linear
transformation, as we now show.

PROPOSITION 3

Let $V$ and $W$ be vector spaces and $T\colon V \to W$ a one-to-one linear transformation.
Then the mapping $T^{-1}\colon R(T) \to V$ is also a linear transformation.


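Proof Let $\mathbf{w}_1$ and $\mathbf{w}_2$ be vectors in $R(T)$, and let $c$ be a scalar. Also let $\mathbf{v}_1$ and
$\mathbf{v}_2$ be vectors in $V$ with $T^{-1}(\mathbf{w}_1) = \mathbf{v}_1$ and $T^{-1}(\mathbf{w}_2) = \mathbf{v}_2$. Since $T$ is linear,

$$\begin{aligned} T(c\mathbf{v}_1 + \mathbf{v}_2) &= cT(\mathbf{v}_1) + T(\mathbf{v}_2) \\ &= c\mathbf{w}_1 + \mathbf{w}_2 \end{aligned}$$

Hence,

$$\begin{aligned} T^{-1}(c\mathbf{w}_1 + \mathbf{w}_2) &= c\mathbf{v}_1 + \mathbf{v}_2 \\ &= cT^{-1}(\mathbf{w}_1) + T^{-1}(\mathbf{w}_2) \end{aligned}$$

Consequently, $T^{-1}$ is a linear transformation.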


Proposition 4 shows that the inverse transformation of an isomorphism defined 
by matrix multiplication can be written using the inverse of the matrix. The proof is 
left as an exercise. 


PROPOSITION 4

Let $A$ be an $n \times n$ invertible matrix and $T\colon \mathbb{R}^n \to \mathbb{R}^n$ the linear transformation
defined by $T(\mathbf{x}) = A\mathbf{x}$. Then $T^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}$.


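EXAMPLE 3 Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the mapping of Example 1 with $T(\mathbf{v}) = A\mathbf{v}$, where

$$A = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}$$

Verify that the inverse map $T^{-1}\colon \mathbb{R}^2 \to \mathbb{R}^2$ is given by $T^{-1}(\mathbf{w}) = A^{-1}\mathbf{w}$, where

$$A^{-1} = \begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}$$

Solution Let $\mathbf{v} = \begin{bmatrix} a \\ b \end{bmatrix}$ be a vector in $\mathbb{R}^2$. Then

$$\mathbf{w} = T(\mathbf{v}) = \begin{bmatrix} 1 & 1 \\ -1 & 0 \end{bmatrix}\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} a + b \\ -a \end{bmatrix}$$

Applying $A^{-1}$ to $\mathbf{w}$, we obtain

$$\begin{bmatrix} 0 & -1 \\ 1 & 1 \end{bmatrix}\begin{bmatrix} a + b \\ -a \end{bmatrix} = \begin{bmatrix} a \\ b \end{bmatrix} = T^{-1}(\mathbf{w})$$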


THEOREM 11

If $V$ and $W$ are vector spaces of dimension $n$, then $V$ and $W$ are isomorphic.

Proof By Theorem 10, there are isomorphisms $T_1\colon V \to \mathbb{R}^n$ and $T_2\colon W \to \mathbb{R}^n$,
as shown in Fig. 1. Let $\varphi = T_2^{-1} \circ T_1\colon V \to W$. To show that $\varphi$ is linear, we first
note that $T_2^{-1}$ is linear by Proposition 3. Next by Theorem 2 of Sec. 4.1, the
composition $T_2^{-1} \circ T_1$ is linear. Finally, by Theorem 4 of Sec. A.2, the mapping $\varphi$
is one-to-one and onto and is therefore a vector space isomorphism.

Figure 1 The isomorphisms $T_1\colon V \to \mathbb{R}^n$ and $T_2\colon W \to \mathbb{R}^n$, with $\varphi = T_2^{-1} \circ T_1\colon V \to W$


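EXAMPLE 4 Find an explicit isomorphism from $P_2$ onto the vector space of $2 \times 2$ symmetric
matrices $S_{2\times 2}$.

Solution To use the method given in the proof of Theorem 11, first let

$$B_1 = \{1, x, x^2\} \qquad\text{and}\qquad B_2 = \left\{ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \right\}$$

be ordered bases for $P_2$ and $S_{2\times 2}$, respectively. Let $T_1$ and $T_2$ be the respective
coordinate maps from $P_2$ and $S_{2\times 2}$ into $\mathbb{R}^3$. Then

$$T_1(ax^2 + bx + c) = \begin{bmatrix} c \\ b \\ a \end{bmatrix} \qquad\text{and}\qquad T_2\left(\begin{bmatrix} a & b \\ b & c \end{bmatrix}\right) = \begin{bmatrix} c \\ b \\ a \end{bmatrix}$$

Observe that $T_2^{-1}\colon \mathbb{R}^3 \to S_{2\times 2}$ maps the vector

$$\begin{bmatrix} c \\ b \\ a \end{bmatrix} \quad\text{to the symmetric matrix}\quad \begin{bmatrix} a & b \\ b & c \end{bmatrix}$$

Thus, the desired isomorphism is given by $(T_2^{-1} \circ T_1)\colon P_2 \to S_{2\times 2}$ with

$$(T_2^{-1} \circ T_1)(ax^2 + bx + c) = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$$

For example,

$$(T_2^{-1} \circ T_1)(x^2 - x + 2) = T_2^{-1}(T_1(x^2 - x + 2)) = T_2^{-1}\left(\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 1 & -1 \\ -1 & 2 \end{bmatrix}$$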
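The isomorphism of this example can be sketched in code. The encoding below is an assumption made for illustration: a polynomial $ax^2 + bx + c$ is stored as the tuple `(a, b, c)` and a symmetric matrix as a list of rows; the names `phi` and `phi_inverse` are not from the text.

```python
def phi(p):
    """Map the coefficients (a, b, c) of ax^2 + bx + c to [[a, b], [b, c]]."""
    a, b, c = p
    return [[a, b], [b, c]]

def phi_inverse(S):
    """Recover (a, b, c) from the symmetric matrix [[a, b], [b, c]]."""
    return (S[0][0], S[0][1], S[1][1])

# The polynomial x^2 - x + 2 from the text.
p = (1, -1, 2)
print(phi(p))

# Linearity check on one sample: phi(3p + q) should equal 3 phi(p) + phi(q).
q = (4, 0, -5)
left = phi(tuple(3 * pi + qi for pi, qi in zip(p, q)))
right = [[3 * s + t for s, t in zip(rs, rt)] for rs, rt in zip(phi(p), phi(q))]
print(left == right, phi_inverse(phi(p)) == p)
```

A single sample does not prove linearity; it only confirms that the formula behaves as the proof of Theorem 11 predicts.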


Fact Summary 


Let V and W be vector spaces and T a linear transformation from V into W. 
1. The mapping $T$ is one-to-one if and only if the null space of $T$ consists of
only the zero vector.

2. If $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a basis for $V$ and $T$ is one-to-one, then
$S = \{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is a basis for the range of $T$. If $T$ is also onto, then
$S$ is a basis for $W$.

3. Every vector space of dimension $n$ is isomorphic to the Euclidean space $\mathbb{R}^n$.

4. If $T$ is one-to-one, then $T^{-1}$ is also a linear transformation.

5. If $V$ and $W$ are both of dimension $n$, then they are isomorphic.

6. Let $A$ be an $n \times n$ matrix and $T(\mathbf{x}) = A\mathbf{x}$. Then the mapping $T$ is an
isomorphism if and only if $A$ is invertible.

7. If $A$ is an invertible matrix and $T(\mathbf{x}) = A\mathbf{x}$, then $T^{-1}(\mathbf{x}) = A^{-1}\mathbf{x}$.


Exercise Set 4.3 


In Exercises 1—6, determine whether the linear 3. T: R? > R’, 
transformation is one-to-one. x xX+y—Z 
T = y 
1. T: R > R’, > an 
x 4x —y y 
F = 
y ý 4. T: R > R, 
A 2 2 2 2 9) 
2. T: Re > R’, i r Zx- ły- 4z 
EDE (EPLE 
7 ge toy =5% = Gy = 32 




5. T: Pa > Po, 
T (p(x)) = p'(x) — pœ) 
6. T: Po > P3, 


T(p(x)) = xp(x) 


In Exercises 7—10, determine whether the linear 
transformation is onto. 


7. T: RR > R?, 
x _ | 3x-y 
D- E] 
8. T: R > R?, 
ARDE a 
y X— zy 
9, T: RÈ > R’, 
x x-—y+2z 
T y = Voz 
z 2z 
10. T: R > R?, 
x 2x +3y— z 
T y = | =x + y+3z 
Z x+4y+2z 


In Exercises 11—14, T: R? — R? is a linear operator. 
Determine whether the set {T (e1), T (e2)} is a basis 
for RÊ. 


11. T: R > R, 
(G eee 
12. T: R > R, 
D-i] 
13. T: R? > R?, 


x t+ ty 
s+ 5y 


In Exercises 15—18, T: R? —> R? is a linear operator. 
Determine whether the set {T (e1), T (e2), T (e3)} is a 
basis for R3. 


15. T: RÈ > R?, 
x —x— y +2z 
T y = y =z 
Z 5z 
16. T: R? > R’, 
x 2x+3y—-— z 
T y = | 2x+6y+3z 
z 4x+9y+2z 
17. T: R > R?, 
x 4x —2y +z 
T y — 2x+2Z 
Z 2x —y t+ 3z 
18. T: RÈ > R?, 
x x—yt+2z 
T y =) ax 2y—2z 
Z —y+5z 


In Exercises 19 and 20, T: Pa —> P is a linear 
operator. Determine whether the set 
{T (1), T(x), T (x?)} is a basis for Po. 


19. T(ax? +bx +c) =(atb+c)x*+(a+b)x+a 
20. T(p(x)) = xp'(x) 


In Exercises 21—24, let T: V —> V be the linear 
operator defined by T (v) = Av. 


a. Show that T is an isomorphism. 
b. Find A™!. 


c. Show directly that T~! (w) = A~ !w for all 
weV. 


we yee aiid 




22. (aD ee d [>] 30. Show that T: P3 —> P3 defined by 
4 T (p(x)) = pœ) + pœ) + pœ) + p(x) 


is an isomorphism. 


=x 
| 
N 
© 
= 
= 


z 0 1 0 z 31. Let A be an n x n invertible matrix. Show that 
T: Mnxn > Mnxn defined by 
x 2 —l1 1 x = 
24. T y =| —-1 1 -1 y T(B) = ABA 
K 0 1 0 z is an isomorphism. 
In Exercises 25—28, determine whether the matrix 32. Find an isomorphism from M2x2 onto R4. 
mapping T: V — V is an isomorphism. 
33. Find an isomorphism from R4 onto P3. 
25. T x _.||-=3 1 X 
À y Bg L =3 y 34. Find an isomorphism from M2x2 onto P3. 
— 35. Let 
26. T x = 2] K x 
y = LLY 
V= y ||x+2y-z=0 
x 0 -1 -1 x z 
27. T y =|2 0 2 y Find an isomorphism from V onto R?. 
z 1 1 -3 Z 
36. Let ¥ 
x 1 3 0 x a 
= b R 
28. T| y| eed -1 2 =3I]y T {i 2 llencen} 
z 0 -l1 3 Z Find an isomorphism from P onto V. 
29. Show that T: Maxn > Mnxn defined by 37. Suppose T: R? —> R? is an isomorphism. Show 
T(A) = At that T takes lines through the origin to lines 
7 through the origin and planes through the origin 
is an isomorphism. to planes through the origin. 
4.4 > Matrix Representation of a Linear Transformation


Matrices have played an important role in our study of linear algebra. In this section
we establish the connection between matrices and linear transformations. To illustrate
the idea, recall from Sec. 4.1 that given any $m \times n$ matrix $A$, we can define a linear
transformation $T\colon \mathbb{R}^n \to \mathbb{R}^m$ by

$$T(\mathbf{v}) = A\mathbf{v}$$

In Example 8 of Sec. 4.1, we showed how a linear transformation $T\colon \mathbb{R}^3 \to \mathbb{R}^2$ is
completely determined by the images of the standard basis vectors $\mathbf{e}_1$, $\mathbf{e}_2$, and $\mathbf{e}_3$ of $\mathbb{R}^3$.
The key was to recognize that a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}$ can be written as

$$\mathbf{v} = v_1\mathbf{e}_1 + v_2\mathbf{e}_2 + v_3\mathbf{e}_3 \qquad\text{so that}\qquad T(\mathbf{v}) = v_1T(\mathbf{e}_1) + v_2T(\mathbf{e}_2) + v_3T(\mathbf{e}_3)$$




In that example, T was defined so that 


rey= |; |.rer=| | rey=(1 | 


Now let A be the 2 x 3 matrix whose column vectors are T(e;), T(e2), and T (e3). 
Then 
1 -1 0 
wel z [ees 
That is, the linear transformation $T$ is given by a matrix product. In general, if
$T\colon \mathbb{R}^n \to \mathbb{R}^m$ is a linear transformation, then it is possible to write

$$T(\mathbf{v}) = A\mathbf{v}$$

where $A$ is the $m \times n$ matrix whose $j$th column vector is $T(\mathbf{e}_j)$ for $j = 1, 2, \ldots, n$.
The matrix $A$ is called the matrix representation of $T$ relative to the standard
bases of $\mathbb{R}^n$ and $\mathbb{R}^m$.
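The construction of the matrix column by column can be sketched as follows. The map `T` below is a hypothetical linear transformation from $\mathbb{R}^3$ to $\mathbb{R}^2$ chosen only for illustration (it is not the map of Example 8 of Sec. 4.1), and the helper names `standard_matrix` and `matvec` are likewise illustrative.

```python
def T(v):
    """A hypothetical linear map from R^3 to R^2, used only as an example."""
    x, y, z = v
    return (x - y, 2 * x + z)

def standard_matrix(T, n):
    """Return the matrix whose jth column is T(e_j), stored as a list of rows."""
    columns = [T(tuple(1 if i == j else 0 for i in range(n)))
               for j in range(n)]
    return [list(row) for row in zip(*columns)]

def matvec(A, v):
    """Matrix-vector product Av."""
    return tuple(sum(a * x for a, x in zip(row, v)) for row in A)

A = standard_matrix(T, 3)
print(A)

# For any v, the matrix product Av should agree with T(v).
v = (3, 4, 5)
print(matvec(A, v) == T(v))
```

Because the columns record only the images of $\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3$, the agreement of $A\mathbf{v}$ with $T(\mathbf{v})$ for other vectors relies on $T$ being linear.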

In this section we show that every linear transformation between finite dimensional
vector spaces can be written as a matrix multiplication. Specifically, let $V$ and $W$ be
finite dimensional vector spaces with fixed ordered bases $B$ and $B'$, respectively. If
$T\colon V \to W$ is a linear transformation, then there exists a matrix $A$ such that

$$[T(\mathbf{v})]_{B'} = A[\mathbf{v}]_B$$

In the case for which $V = \mathbb{R}^n$, $W = \mathbb{R}^m$, and $B$ and $B'$ are, respectively, the standard
bases, the last equation is equivalent to

$$T(\mathbf{v}) = A\mathbf{v}$$

as above. We now present the details.
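Let $V$ and $W$ be vector spaces with ordered bases $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ and $B' =
\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$, respectively, and let $T\colon V \to W$ be a linear transformation. Now
let $\mathbf{v}$ be any vector in $V$ and let

$$[\mathbf{v}]_B = \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}$$

be the coordinate vector of $\mathbf{v}$ relative to the basis $B$. Thus,

$$\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$$

Applying $T$ to both sides of this last equation gives

$$\begin{aligned} T(\mathbf{v}) &= T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n) \\ &= c_1T(\mathbf{v}_1) + c_2T(\mathbf{v}_2) + \cdots + c_nT(\mathbf{v}_n) \end{aligned}$$

Note that for each $i = 1, 2, \ldots, n$ the vector $T(\mathbf{v}_i)$ is in $W$. Thus, there are unique
scalars $a_{ij}$ with $1 \le i \le m$ and $1 \le j \le n$ such that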


THEOREM 12 


4.4 Matrix Representation of a Linear Transformation 237 


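$$\begin{aligned} T(\mathbf{v}_1) &= a_{11}\mathbf{w}_1 + a_{21}\mathbf{w}_2 + \cdots + a_{m1}\mathbf{w}_m \\ T(\mathbf{v}_2) &= a_{12}\mathbf{w}_1 + a_{22}\mathbf{w}_2 + \cdots + a_{m2}\mathbf{w}_m \\ &\;\;\vdots \\ T(\mathbf{v}_n) &= a_{1n}\mathbf{w}_1 + a_{2n}\mathbf{w}_2 + \cdots + a_{mn}\mathbf{w}_m \end{aligned}$$

Thus, the coordinate vectors relative to the ordered basis B’ are given by

$$[T(\mathbf{v}_i)]_{B'} = \begin{bmatrix} a_{1i} \\ a_{2i} \\ \vdots \\ a_{mi} \end{bmatrix} \qquad \text{for } i = 1, 2, \ldots, n$$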


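Recall from Example 7 of Sec. 4.1 that the coordinate map defines a linear transformation. Thus, the coordinate vector of T(v) relative to B’ can be written in vector
form as

$$[T(\mathbf{v})]_{B'} = c_1\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + c_2\begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + c_n\begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}$$

or in matrix form as

$$[T(\mathbf{v})]_{B'} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_n \end{bmatrix}$$

The matrix on the right-hand side of the last equation is denoted by $[T]_B^{B'}$, with

$$[T]_B^{B'} = \Big[\; [T(\mathbf{v}_1)]_{B'} \;\Big|\; [T(\mathbf{v}_2)]_{B'} \;\Big|\; \cdots \;\Big|\; [T(\mathbf{v}_n)]_{B'} \;\Big]$$

We call $[T]_B^{B'}$ the matrix of T relative to B and B’. In the case for which $T\colon V \to V$
is a linear operator and B is a fixed ordered basis for V, the matrix representation for
the mapping T is denoted by $[T]_B$.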
The preceding discussion is summarized in Theorem 12. 


Let V and W be finite dimensional vector spaces with ordered bases $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$
and $B' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$, respectively, and let $T\colon V \to W$ be a linear
transformation. Then the matrix $[T]_B^{B'}$ is the matrix representation for T relative to
the bases B and B’. Moreover, the coordinates of T(v) relative to B’ are given by

$$[T(\mathbf{v})]_{B'} = [T]_B^{B'}[\mathbf{v}]_B$$




Suppose that in Theorem 12 the vector spaces V and W are the same, B and B’
are two different ordered bases for V, and $T\colon V \to V$ is the identity operator, that
is, T(v) = v for all v in V. Then $[T]_B^{B'}$ is the change of bases matrix $[I]_B^{B'}$, given in
Sec. 3.4.


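| EXAMPLE 1 | Define the linear operator $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ by

$$T\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} x \\ -y \\ z \end{bmatrix}$$

a. Find the matrix of T relative to the standard basis for $\mathbb{R}^3$.
b. Use the result of part (a) to find

$$T\left(\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}\right)$$

Solution a. Let $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ be the standard basis for $\mathbb{R}^3$. Since

$$[T(\mathbf{e}_1)]_B = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad [T(\mathbf{e}_2)]_B = \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix} \qquad [T(\mathbf{e}_3)]_B = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$$

then

$$[T]_B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$

b. Since B is the standard basis for $\mathbb{R}^3$, the coordinates of any vector are given
by its components. In this case, with

$$\mathbf{v} = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} \qquad\text{then}\qquad [\mathbf{v}]_B = \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}$$

Thus, by Theorem 12,

$$T(\mathbf{v}) = [T(\mathbf{v})]_B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$$

Notice that the action of T is a reflection through the xz plane, as shown in
Fig. 1.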


Figure 1 


The following steps summarize the process for finding the matrix representation
of a linear transformation $T\colon V \to W$ relative to the ordered bases B and B’.

1. For the given basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$, find $T(\mathbf{v}_1), T(\mathbf{v}_2), \ldots, T(\mathbf{v}_n)$.

2. Find the coordinates of $T(\mathbf{v}_1), T(\mathbf{v}_2), \ldots, T(\mathbf{v}_n)$ relative to the basis $B' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_m\}$ of W. That is, find $[T(\mathbf{v}_1)]_{B'}, [T(\mathbf{v}_2)]_{B'}, \ldots, [T(\mathbf{v}_n)]_{B'}$.

3. Define the m × n matrix $[T]_B^{B'}$ with ith column vector equal to $[T(\mathbf{v}_i)]_{B'}$.

4. Compute $[\mathbf{v}]_B$.

5. Compute the coordinates of T(v) relative to B’ by

$$[T(\mathbf{v})]_{B'} = [T]_B^{B'}[\mathbf{v}]_B$$

6. Then $T(\mathbf{v}) = c_1\mathbf{w}_1 + c_2\mathbf{w}_2 + \cdots + c_m\mathbf{w}_m$, where $c_1, c_2, \ldots, c_m$ are the components of $[T(\mathbf{v})]_{B'}$.


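| EXAMPLE 2 | Let $T\colon \mathbb{R}^2 \to \mathbb{R}^3$ be the linear transformation defined by

$$T(\mathbf{v}) = T\left(\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}\right) = \begin{bmatrix} x_2 \\ x_1 + x_2 \\ x_1 - x_2 \end{bmatrix}$$

and let

$$B = \left\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 3 \\ 1 \end{bmatrix} \right\} \qquad B' = \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$$

be ordered bases for $\mathbb{R}^2$ and $\mathbb{R}^3$, respectively.

a. Find the matrix $[T]_B^{B'}$.

b. Let $\mathbf{v} = \begin{bmatrix} -3 \\ -2 \end{bmatrix}$. Find T(v) directly and then use the matrix found in part (a).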


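Solution a. We first apply T to the basis vectors of B, which gives

$$T\left(\begin{bmatrix} 1 \\ 2 \end{bmatrix}\right) = \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix} \qquad\text{and}\qquad T\left(\begin{bmatrix} 3 \\ 1 \end{bmatrix}\right) = \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}$$

Next we find the coordinates of each of these vectors relative to the basis B’.
That is, we find scalars such that

$$a_1\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + a_2\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + a_3\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix}$$

and

$$b_1\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + b_2\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + b_3\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix}$$

The solution to the first linear system is

$$a_1 = -1 \qquad a_2 = 4 \qquad a_3 = -1$$

and the solution to the second system is

$$b_1 = -3 \qquad b_2 = 2 \qquad b_3 = 2$$

Thus,

$$[T]_B^{B'} = \begin{bmatrix} -1 & -3 \\ 4 & 2 \\ -1 & 2 \end{bmatrix}$$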

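b. Using the definition of T directly, we have

$$T\left(\begin{bmatrix} -3 \\ -2 \end{bmatrix}\right) = \begin{bmatrix} -2 \\ -3 - 2 \\ -3 + 2 \end{bmatrix} = \begin{bmatrix} -2 \\ -5 \\ -1 \end{bmatrix}$$

Now, to use the matrix found in part (a), we need to find the coordinates of
v relative to B. Observe that the solution to the equation

$$a_1\begin{bmatrix} 1 \\ 2 \end{bmatrix} + a_2\begin{bmatrix} 3 \\ 1 \end{bmatrix} = \begin{bmatrix} -3 \\ -2 \end{bmatrix} \qquad\text{is}\qquad a_1 = -\tfrac{3}{5} \quad a_2 = -\tfrac{4}{5}$$

Thus, the coordinate vector of $\begin{bmatrix} -3 \\ -2 \end{bmatrix}$ relative to B is

$$[\mathbf{v}]_B = \begin{bmatrix} -\tfrac{3}{5} \\ -\tfrac{4}{5} \end{bmatrix}$$

We can now evaluate T, using matrix multiplication, so that

$$[T(\mathbf{v})]_{B'} = \begin{bmatrix} -1 & -3 \\ 4 & 2 \\ -1 & 2 \end{bmatrix}\begin{bmatrix} -\tfrac{3}{5} \\ -\tfrac{4}{5} \end{bmatrix} = \begin{bmatrix} 3 \\ -4 \\ -1 \end{bmatrix}$$

Hence,

$$T(\mathbf{v}) = 3\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} - 4\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 \\ -5 \\ -1 \end{bmatrix}$$

which agrees with the direct computation.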
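The six-step procedure used in this example can be sketched in Python with exact rational arithmetic. The sketch assumes the data of the example as read here, namely $T(x_1, x_2) = (x_2,\, x_1 + x_2,\, x_1 - x_2)$, $B = \{(1,2), (3,1)\}$, $B' = \{(1,0,0), (1,1,0), (1,1,1)\}$ and $\mathbf{v} = (-3,-2)$. The helper `solve` is a hypothetical name for a small Gauss-Jordan routine and assumes its columns form a basis.

```python
from fractions import Fraction


def solve(columns, rhs):
    """Solve c_1*columns[0] + ... + c_n*columns[n-1] = rhs exactly.

    Assumes the given columns form a basis (square, invertible system).
    """
    n = len(rhs)
    # Augmented matrix [columns | rhs] with rational entries.
    M = [[Fraction(columns[j][i]) for j in range(n)] + [Fraction(rhs[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]


def T(v):
    x1, x2 = v
    return [x2, x1 + x2, x1 - x2]


B = [[1, 2], [3, 1]]
B_prime = [[1, 0, 0], [1, 1, 0], [1, 1, 1]]

# Steps 1-3: column i of the matrix is the B'-coordinate vector of T(v_i).
cols = [solve(B_prime, T(v_i)) for v_i in B]
matrix = [[cols[j][i] for j in range(len(B))] for i in range(len(B_prime))]

# Step 4: coordinates of v relative to B.
v = [-3, -2]
v_B = solve(B, v)

# Step 5: coordinates of T(v) relative to B'.
Tv_Bprime = [sum(a * c for a, c in zip(row, v_B)) for row in matrix]

# Step 6: rebuild T(v) as a linear combination of the vectors in B'.
Tv = [sum(c * w[i] for c, w in zip(Tv_Bprime, B_prime)) for i in range(3)]
```

Using `Fraction` keeps the intermediate coordinates such as $-\tfrac{3}{5}$ exact, so the final comparison with the direct computation involves no rounding.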


| EXAMPLE 3 | Define a linear transformation $T\colon P_2 \to P_3$ by

$$T(f(x)) = x^2f''(x) - 2f'(x) + xf(x)$$

Find the matrix representation of T relative to the standard bases for $P_2$ and $P_3$.




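Solution Since the standard basis for $P_2$ is $B = \{1, x, x^2\}$, we first compute

$$T(1) = x \qquad T(x) = x^2 - 2 \qquad T(x^2) = x^2(2) - 2(2x) + x(x^2) = x^3 + 2x^2 - 4x$$

Since the standard basis for $P_3$ is $B' = \{1, x, x^2, x^3\}$, the coordinates relative to
B’ are

$$[T(1)]_{B'} = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \qquad [T(x)]_{B'} = \begin{bmatrix} -2 \\ 0 \\ 1 \\ 0 \end{bmatrix} \qquad\text{and}\qquad [T(x^2)]_{B'} = \begin{bmatrix} 0 \\ -4 \\ 2 \\ 1 \end{bmatrix}$$

Hence, the matrix of the transformation is given by

$$[T]_B^{B'} = \begin{bmatrix} 0 & -2 & 0 \\ 1 & 0 & -4 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix}$$

As an example, let $f(x) = x^2 - 3x + 1$. Since $f'(x) = 2x - 3$ and $f''(x) = 2$,
we have

$$\begin{aligned} T(f(x)) &= x^2(2) - 2(2x - 3) + x(x^2 - 3x + 1) \\ &= x^3 - x^2 - 3x + 6 \end{aligned}$$

Using the matrix representation of T to find the same image, we observe that

$$[f]_B = \begin{bmatrix} 1 \\ -3 \\ 1 \end{bmatrix}$$

The coordinates of the image of f(x) under the mapping T relative to B’ are then
given by

$$[T(f(x))]_{B'} = \begin{bmatrix} 0 & -2 & 0 \\ 1 & 0 & -4 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ -3 \\ 1 \end{bmatrix} = \begin{bmatrix} 6 \\ -3 \\ -1 \\ 1 \end{bmatrix}$$

The image T(f(x)) is the linear combination of the monomials in B’ with coefficients the components of $[T(f(x))]_{B'}$, that is,

$$T(f(x)) = 6(1) - 3(x) - x^2 + x^3 = x^3 - x^2 - 3x + 6$$

This agrees with the direct calculation.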
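The same computation can be checked with a short Python sketch in which a polynomial is stored as its coefficient list relative to the standard basis, so `[1, -3, 1]` stands for $1 - 3x + x^2$. The helper names (`derivative`, `add`, `shift`, `scale`) are assumptions made for this illustration only.

```python
def derivative(p):
    """Derivative of a polynomial given as coefficients [a0, a1, a2, ...]."""
    return [k * p[k] for k in range(1, len(p))] or [0]


def add(p, q):
    """Sum of two coefficient lists of possibly different lengths."""
    n = max(len(p), len(q))
    p = p + [0] * (n - len(p))
    q = q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]


def shift(p, k):
    """Multiply the polynomial by x^k."""
    return [0] * k + p


def scale(c, p):
    return [c * a for a in p]


def T(f):
    """T(f) = x^2 f'' - 2 f' + x f, returned as 4 coefficients (an element of P3)."""
    f1 = derivative(f)
    f2 = derivative(f1)
    result = add(add(shift(f2, 2), scale(-2, f1)), shift(f, 1))
    return (result + [0] * 4)[:4]


# Columns of the matrix are the coordinates of T(1), T(x), T(x^2).
basis_P2 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
cols = [T(b) for b in basis_P2]
matrix = [[cols[j][i] for j in range(3)] for i in range(4)]

f = [1, -3, 1]  # f(x) = x^2 - 3x + 1
by_matrix = [sum(a * c for a, c in zip(row, f)) for row in matrix]
direct = T(f)
```

Here the coordinate map is the identity on coefficient lists, which is why the columns of `matrix` can be read off directly from `T` applied to the basis polynomials.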


In Sec. 4.1 we discussed the addition, scalar multiplication, and composition of 
linear maps. The matrix representations for these combinations are given in a natural 
way, as described by Theorems 13 and 14. The proofs are omitted. 




THEOREM 13 


Solution 


THEOREM 14 


Let V and W be finite dimensional vector spaces with ordered bases B and B’,
respectively. If S and T are linear transformations from V to W, then

1. $[S + T]_B^{B'} = [S]_B^{B'} + [T]_B^{B'}$
2. $[kT]_B^{B'} = k[T]_B^{B'}$ for any scalar k


As before in the special case for which S and T are linear operators on a finite
dimensional vector space V, and B is a fixed ordered basis for V, the notation becomes

$$[S + T]_B = [S]_B + [T]_B \qquad\text{and}\qquad [kT]_B = k[T]_B$$


Let S and T be linear operators on R? with 


(Deia =e 
y =y y 3x 
If B is the standard basis for $\mathbb{R}^2$, find $[S + T]_B$ and $[3S]_B$.


The matrix representations for the linear operators S and T are, respectively, 


[Sin = seen S(e2) a Ei 
B B 
and 
T]g = T T =| ae 
(Tls=|| Ted e» | |=173 4 
B B 


Then by Theorem 13, 


srie o 4 |+| a gaa 


i 2 3 6 
Bss =3| 4 E 2i 


and 


As we mentioned in Sec. 4.1, the matrix of the composition is the product of the 


matrices of the individual maps, as given in Theorem 14. 


Let U, V, and W be finite dimensional vector spaces with ordered bases B, B’, and
B″, respectively. If $T\colon U \to V$ and $S\colon V \to W$ are linear transformations, then

$$[S \circ T]_B^{B''} = [S]_{B'}^{B''}\,[T]_B^{B'}$$




Again, if S and T are linear operators on a finite dimensional vector space V,
and B is a fixed ordered basis for V, then

$$[S \circ T]_B = [S]_B[T]_B$$


Repeated application of Theorem 14 gives the following result. 


COROLLARY 1 Let V be a finite dimensional vector space with ordered basis B. If T is a linear
operator on V, then

$$[T^n]_B = ([T]_B)^n$$


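| EXAMPLE 5 | Let $D\colon P_3 \to P_3$ be the linear operator defined by

$$D(p(x)) = p'(x)$$

a. Find the matrix of D relative to the standard basis $B = \{1, x, x^2, x^3\}$. Use the
matrix to find the derivative of $p(x) = 1 - x + 2x^3$.

b. Find the matrix needed to compute the second derivative of a polynomial in
$P_3$. Use this matrix to find the second derivative of $p(x) = 1 - x + 2x^3$.

Solution a. By Theorem 12, we have

$$[D]_B = \Big[\; [D(1)]_B \;\Big|\; [D(x)]_B \;\Big|\; [D(x^2)]_B \;\Big|\; [D(x^3)]_B \;\Big] = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

Since the coordinate vector of $p(x) = 1 - x + 2x^3$, relative to B, is given by

$$[p(x)]_B = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 2 \end{bmatrix}$$

then

$$[D(p(x))]_B = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \\ 0 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ 0 \\ 6 \\ 0 \end{bmatrix}$$

Therefore, as expected, $D(p(x)) = -1 + 6x^2$.

b. By Corollary 1, the matrix we need is given by

$$[D^2]_B = ([D]_B)^2 = \begin{bmatrix} 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 6 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$$

If $p(x) = 1 - x + 2x^3$, then

$$[D^2(p(x))]_B = \begin{bmatrix} 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 6 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}\begin{bmatrix} 1 \\ -1 \\ 0 \\ 2 \end{bmatrix} = \begin{bmatrix} 0 \\ 12 \\ 0 \\ 0 \end{bmatrix}$$

so that $p''(x) = 12x$.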
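As a quick check of Corollary 1 in this setting, the derivative matrix on $P_3$ can be built and squared in Python. This is a sketch only; `mat_mul` and `mat_vec` are hypothetical helper names, and polynomials are represented by their coordinate vectors relative to $\{1, x, x^2, x^3\}$.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_vec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]


n = 4
# Column j holds the coordinates of D(x^j) = j*x^(j-1): the entry j sits in row j-1.
D = [[(j if j == i + 1 else 0) for j in range(n)] for i in range(n)]

p = [1, -1, 0, 2]             # p(x) = 1 - x + 2x^3
first = mat_vec(D, p)         # coordinates of p'(x)
D2 = mat_mul(D, D)            # matrix of the second-derivative operator
second = mat_vec(D2, p)       # coordinates of p''(x)
```

Applying `D` twice to `p` gives the same coordinates as applying `D2` once, which is the statement $[D^2]_B = ([D]_B)^2$ for this operator.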


The final result of this section describes how to find the matrix representation of 
the inverse map of an invertible linear operator. 


COROLLARY 2 Let T be an invertible linear operator on a finite dimensional vector space V and
B an ordered basis for V. Then

$$[T^{-1}]_B = ([T]_B)^{-1}$$

Proof Since $T^{-1} \circ T$ is the identity map, by Theorem 14, we have

$$[I]_B = [T^{-1} \circ T]_B = [T^{-1}]_B[T]_B$$

Since $[I]_B$ is the identity matrix, $[T^{-1}]_B = ([T]_B)^{-1}$.


Fact Summary

Let V and W be vector spaces, $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ and $B' = \{\mathbf{w}_1, \ldots, \mathbf{w}_m\}$ ordered
bases of V and W, respectively, and T a linear transformation from V into W.

1. The matrix of T relative to B and B’ is given by

$$[T]_B^{B'} = \big[\; [T(\mathbf{v}_1)]_{B'} \;\; [T(\mathbf{v}_2)]_{B'} \;\; \cdots \;\; [T(\mathbf{v}_n)]_{B'} \;\big]$$

2. If v is a vector in V, the coordinates of T(v) relative to the basis B’ can be
computed by

$$[T(\mathbf{v})]_{B'} = [T]_B^{B'}[\mathbf{v}]_B$$

3. To find T(v), multiply each basis vector in B’ by the corresponding
component of $[T(\mathbf{v})]_{B'}$. That is, if $[T(\mathbf{v})]_{B'} = [\,b_1' \;\; b_2' \;\; \ldots \;\; b_m'\,]^t$, then

$$T(\mathbf{v}) = b_1'\mathbf{w}_1 + b_2'\mathbf{w}_2 + \cdots + b_m'\mathbf{w}_m$$

4. If S is another linear transformation from V into W, then the matrix
representation of S + T relative to B and B’ is the sum of the matrix
representations for S and T. That is, $[S + T]_B^{B'} = [S]_B^{B'} + [T]_B^{B'}$.

5. If c is a scalar, then to find the matrix representation of cT relative to B and
B’, multiply the matrix representation for T by c. That is, $[cT]_B^{B'} = c[T]_B^{B'}$.

6. If S is a linear transformation from W into Z and B″ is an ordered basis
for Z, then $[S \circ T]_B^{B''} = [S]_{B'}^{B''}[T]_B^{B'}$.

7. $[T^n]_B = ([T]_B)^n$

8. If T is invertible, then $[T^{-1}]_B = ([T]_B)^{-1}$.


Exercise Set 4.4 


In Exercises 1-4, T: R” — R” is a linear operator. 4. T: R? > R?, 
a. Find the matrix representation for T relative to the x x 
standard basis for R”. T y = y 
b. Find T(v), using a direct computation and using z =z 
the matrix representation. ; 
1. T: R? > R’, ya] 5 


In Exercises 5-12, T: V — V is a linear operator with 


2. T: RR > R2 B and B’ ordered bases for V. 
T (| ` }) = | as | a. Find the matrix representation for T relative to 
y y the ordered bases B and B’. 
—1 
YS 3 | b. Find T (v), using a direct computation and 
using the matrix representation. 
3. T: R? > R?, 
x —x+y42z 5. T: R > R?, 
T y = 3y +z 7 —x + 2y 
Z x-z T y = 3x 


1 
v= | -2 B 


: “{L-t} Lo} 


246 Chapter 4 Linear Transformations 


Belha] Bal, 9. T: Pa > Pa, 
: l T (ax? +bx +c) =ax? +bx +c 
v=] 3 B ={1,1— x, (1 — x)}) 
B' = 1 2 
6. T: RÈ > R?, Haas] 
v=x — 3x +3 
x 2% = Z 
T y =| -2474 10. T: P2 > Po, 
z oe T(p(x)) = p'a) + pa) 
—1 1 1 2 2 
B={1—x-— x^,1,1 
B= ol l2ļ, 12 E raed 
1 0 1 B' = {1 +x, —1 +x +a’, x} 
1 0 0 v=l1-x 
B= : i : : 11. Let 
1 0 
1 se | 0 -1 | 
ve oat and let T be the linear operator on all 2 x 2 
l matrices with trace 0, defined by 
7. T: R > RÈ, T(A)= AH- HA 
Te -dlo flo opli ol) 
T = a = ’ ’ 
Oi lees ={Lo a} fo of lr o 
—1 1 Ba 
B=5| 4/1] 4 ae ee 
~ 13 -2 
Ee E 
-2 -2 12. T: Mo,.2 => Mo x2, 
—1 
aa T(A)=2A' +A 
8. T: RÈ > RÈ, B and B’ the standard basis on M2x2 
1 3 
x x+z YE ea 
T y = | 2y- x 
z yrs 13. Let T: R? — R? be the linear operator defined by 
—1 -1 0 
Se (E D-3] 
1 1 1 2 2 
0 1 =f Let B be the standard ordered basis for R? and B’ 
B= | 0 | 0 —1 the ordered basis for R? defined by 
1 -1 0 
reile bal 
—2 2 —1 
v= 1 a. Find [T]g. 
3 b. Find [T]. 


14. 


15. 


c. Find [T]Ż. 

d. Find [TIS 

e. Let C be the ordered basis obtained by 
switching the order of the vectors in B. Find 
[T]Ë. 

f. Let C’ be the ordered basis obtained by 
switching the order of the vectors in B’. Find 
[rye 


Let T: R? — R? be the linear transformation 


defined by 
x—y 
gE) a 
4 x + 2y 
Let B and B’ be ordered bases for R? and B” the 
ordered basis for R? defined by 


Ph bla) 
rao Ri 


a. Find [T]Ż". 

b. Find [T]5;. 

c. Let C be the ordered basis obtained by 
switching the order of the vectors in B. Find 
[T]é. 

d. Let C’ be the ordered basis obtained by 
switching the order of the vectors in B’. Find 
aes 

e. Let C” be the ordered basis obtained by 
switching the order of the first and third 
vectors in B”. Find (T]S". 


Let T: Pı — P2 be the linear transformation 
defined by 
b 5 
T(a+ bx) = ax + td 


Let B and B’ be the standard ordered bases for 
Pı and P3, respectively. 
a. Find [T]8’. 




16. 


17. 




b. Let C be the ordered basis obtained by 
switching the order of the vectors in B. Find 
ITE. 

c. Let C’ be the ordered basis obtained by 
switching the first and second vectors in B’. 
Find [T]¢. 

d. Define S: Pz — Pı by 


S(a + bx + cx?) = b + 2cx 


Find [S]#,. 
e. Verify that [S]3,[7]% = I, but that 
[T]; (lp # 1. 
f. Interpret the statement 
[S]IŻ [T]? = I 


in terms of the functions T and S. 


Define a linear operator T: Mox2 — M2x2 by 


a eee es 


Let B be the standard ordered basis for M>,.2 and 
B’ the ordered basis 


ratte rir o| 


[aaki ah 
=i e E a 

a. Find [T]Ż'. 

b. Find [T]Ż},. 

c. Find [T]. 

d. Find [7] and [7]2,. 

e. Verify that 


(718 = (Tp 
[T] = T] 


Define a linear operator T: R? —> R? by 


S 


Find the matrix for T relative to the standard 


basis for R?. Describe geometrically the action of 
T on a vector in R?. 




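18. Define a linear operator $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ by

$$T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$$

Describe geometrically the action of T on a vector in $\mathbb{R}^2$.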


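19. Let c be a fixed scalar and define $T\colon \mathbb{R}^n \to \mathbb{R}^n$ by

$$T\left(\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}\right) = c\begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$

Find the matrix for T relative to the standard
basis for $\mathbb{R}^n$.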


20. Define $T\colon M_{2\times 2} \to M_{2\times 2}$ by

$$T(A) = A - A^t$$

Find the matrix for T relative to the standard
basis for $M_{2\times 2}$.


21. Define $T\colon M_{2\times 2} \to \mathbb{R}$ by

$$T(A) = \operatorname{tr}(A)$$

Find the matrix $[T]_B^{B'}$, where B is the standard
basis for $M_{2\times 2}$ and $B' = \{1\}$.


In Exercises 22-25, let S, T: R? — R? be defined by 


[Delta] 
(5 D= [sts] 


a. Find the matrix representation for the given 
linear operator relative to the standard basis. 


and 


b. Compute the image of v = E directly 


and using the matrix found in part (a). 
22. —3S 
23. 2T+S 
24. ToS 
25. SoT 


In Exercises 26-29, let S, T: R? —> R? be defined by 


x x-y-Zz 
Z a a 
and 
x 3x —Z 
S y = x 
Zz Z 


a. Find the matrix representation for the given 
linear operator relative to the standard basis. 


b. Compute the image of 
-1 


directly and using the matrix found in part (a). 
26. 2T 
27. —3T +28 
28. ToS 
29. SoT 
30. Let B be the basis for R? defined by 


ge fold, 


If T: R? — R? is the linear operator defined by 


x = 9x— Sy 
bess 
find the matrix for T*, for k > 1, relative to the 


basis B. 
31. Define T: P4 —> Pa by 


T (p(x)) = p(x) 


Find the matrix for T relative to the standard 
basis for P4. Use the matrix to find the third 
derivative of p(x) = —2x4 — 2x3 + x? — 2x — 3. 


32. Let $T\colon P_2 \to P_2$ be defined by

$$T(p(x)) = p(x) + xp'(x)$$

Find the matrix $[T]_B$ where B is the standard
basis for $P_2$.


4.5 Similarity 249 


33. Let S: P2 > P3 and D: P3 > P2 be defined by reflects a vector v through the line 
perpendicular to v1. 

S(p(x)) = xp(x) 

35. Let A be a fixed 2 x 2 matrix and define 

and ae ia T: Mox2 > Mo x2 by 
x)) = p(x 
: ; ne a B T(B)= AB — BA 
Find the matrices [S] and [D]ġ, where 


B = {1, x, x?} and B’ = {1, x, x2, x3}. Observe Find the matrix for T relative to the standard 
that the operator T in Exercise 32 satisfies basis for M2x2. 
T = DoS. Verify Theorem 14 by showing that 36. Let B = {v1, v2, v3} and B’ = {v2, V1, v3} be 
[T]z = [Dp [S] : ordered bases for the vector space V. If 
34. a. Define a basis for R2 by T: V V is defined by T(v) = y, then find 
eae . Describe the relationship between [v]g and 
B= { | 1 | | 0 \ [v]g and the relationship between the identity 
1p |1 matrix J and [ties 
Find [T]g where T: R? —> R? is the linear 37. Let V be a vector space and B = {v1, V2,..., Vn} 
operator that reflects a vector v through the be an ordered basis for V. Define v9 = 0 and 
T: V > V by 


line perpendicular to 
b. Let B = {v1, v2} be a basis for R2. Find [T]g EOE NE Viet, CAS gn 
where T: R? — R? is the linear operator that Find [T]z. 


4.5 > Similarity 


We have just seen in Sec. 4.4 that if T: V — V is a linear operator on the vector 
space V, and B is an ordered basis for V, then T has a matrix representation relative 
to B. The specific matrix for T depends on the particular basis; consequently, the 
matrix associated with a linear operator is not unique. However, the action of the 
operator T on V is always the same regardless of the particular matrix representation, 
as illustrated in Example 1. 


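| EXAMPLE 1 | Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the linear operator defined by

$$T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x + y \\ -2x + 4y \end{bmatrix}$$

Also let $B_1 = \{\mathbf{e}_1, \mathbf{e}_2\}$ be the standard basis for $\mathbb{R}^2$ and let $B_2 = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \end{bmatrix} \right\}$
be a second basis for $\mathbb{R}^2$. Verify that the action on the vector $\mathbf{v} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$ by the
operator T is the same regardless of the matrix representation used for T.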




Solution 


THEOREM 15 


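The matrix representations for T relative to $B_1$ and $B_2$ are

$$[T]_{B_1} = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix} \qquad\text{and}\qquad [T]_{B_2} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$$

respectively. Next, observe that

$$[\mathbf{v}]_{B_1} = \begin{bmatrix} 2 \\ 3 \end{bmatrix} \qquad\text{and}\qquad [\mathbf{v}]_{B_2} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$$

Applying the matrix representations of the operator T relative to $B_1$ and $B_2$, we
obtain

$$[T(\mathbf{v})]_{B_1} = [T]_{B_1}[\mathbf{v}]_{B_1} = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}\begin{bmatrix} 2 \\ 3 \end{bmatrix} = \begin{bmatrix} 5 \\ 8 \end{bmatrix}$$

and

$$[T(\mathbf{v})]_{B_2} = [T]_{B_2}[\mathbf{v}]_{B_2} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$$

To see that the result is the same, observe that

$$T(\mathbf{v}) = 5\begin{bmatrix} 1 \\ 0 \end{bmatrix} + 8\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 5 \\ 8 \end{bmatrix} \qquad\text{and}\qquad T(\mathbf{v}) = 2\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 3\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 5 \\ 8 \end{bmatrix}$$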


Theorem 15 gives the relationship between the matrices for a linear operator 
relative to two distinct bases. 


Let V be a finite dimensional vector space, $B_1$ and $B_2$ two ordered bases for V,
and $T\colon V \to V$ a linear operator. Let $P = [I]_{B_2}^{B_1}$ be the transition matrix from $B_2$
to $B_1$. Then

$$[T]_{B_2} = P^{-1}[T]_{B_1}P$$


Proof Let v be any vector in V. By Theorem 12 of Sec. 4.4, we have

$$[T(\mathbf{v})]_{B_2} = [T]_{B_2}[\mathbf{v}]_{B_2}$$

Alternatively, we can compute $[T(\mathbf{v})]_{B_2}$ as follows: First, since P is the transition
matrix from $B_2$ to $B_1$,

$$[\mathbf{v}]_{B_1} = P[\mathbf{v}]_{B_2}$$

Thus, the coordinates of T(v) relative to $B_1$ are given by

$$[T(\mathbf{v})]_{B_1} = [T]_{B_1}[\mathbf{v}]_{B_1} = [T]_{B_1}P[\mathbf{v}]_{B_2}$$

Now, to find the coordinates of T(v) relative to $B_2$, we multiply on the left by
$P^{-1}$, which is the transition matrix from $B_1$ to $B_2$, to obtain

$$[T(\mathbf{v})]_{B_2} = P^{-1}[T]_{B_1}P[\mathbf{v}]_{B_2}$$

Since both representations for $[T(\mathbf{v})]_{B_2}$ hold for all vectors v in V, then $[T]_{B_2} = P^{-1}[T]_{B_1}P$. See Fig. 1.
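The relation $[T]_{B_2} = P^{-1}[T]_{B_1}P$ can be checked numerically with a small Python sketch. The matrices `A` and `P` below are illustrative values assumed for the sketch (an operator matrix relative to the standard basis and a transition matrix whose columns are the vectors of a second basis), and the helper names are hypothetical.

```python
from fractions import Fraction


def mat_mul(A, B):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mat_vec(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1]]


def inverse_2x2(M):
    """Inverse of an invertible 2 x 2 matrix, with exact rational entries."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


A = [[1, 1], [-2, 4]]   # assumed [T]_{B1}
P = [[1, 1], [1, 2]]    # assumed transition matrix from B2 to B1

P_inv = inverse_2x2(P)
A2 = mat_mul(mat_mul(P_inv, A), P)   # [T]_{B2} by the similarity relation

# Follow one vector around the diagram: start with B2-coordinates.
v_B2 = [1, 1]
v_B1 = mat_vec(P, v_B2)              # convert to B1-coordinates
Tv_B1 = mat_vec(A, v_B1)             # apply T in B1-coordinates
Tv_B2 = mat_vec(A2, v_B2)            # apply T in B2-coordinates
```

Converting `Tv_B2` back with `P` should reproduce `Tv_B1`, which is the statement that both routes through the diagram agree.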


Figure 1: Diagram relating $[\mathbf{v}]_{B_1}$, $[T(\mathbf{v})]_{B_1}$, $[\mathbf{v}]_{B_2}$, and $[T(\mathbf{v})]_{B_2}$ through $[T]_{B_1}$, $[T]_{B_2}$, and $P$.


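| EXAMPLE 2 | Let T, $B_1$, and $B_2$ be the linear operator and bases of Example 1. Then

$$[T]_{B_1} = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$$

Use Theorem 15 to verify that

$$[T]_{B_2} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$$

Solution Since $B_1$ is the standard basis for $\mathbb{R}^2$, by Theorem 14 of Sec. 3.4 the transition
matrix from $B_2$ to $B_1$ is

$$P = [I]_{B_2}^{B_1} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}$$

and hence

$$P^{-1} = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}$$

Then

$$P^{-1}[T]_{B_1}P = \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}\begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 3 \end{bmatrix}$$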


| EXAMPLE3 | Let T: R? — R? be the linear operator given by 
T x |= +2y 
y 3x+ y 


and let 




Solution 


DEFINITION 1 


be ordered bases for $\mathbb{R}^2$. Find the matrix of T relative to $B_1$, and then use Theorem 15
to find the matrix of T relative to $B_2$.


r((2)-[4] = r((S)-[4] 
mm =([-$], E 4 


The transition matrix from Bz to Bı is 


p= tik =|[ E a a] 


Therefore, by Theorem 15 


Since 


we have 


In general, if the square matrices A and B are matrix representations for the same 
linear operator, then the matrices are called similar. Using Theorem 15, we can define 
similarity for square matrices without reference to a linear operator. 


Similar Matrices Let A and B be n × n matrices. We say that A is similar to
B if there is an invertible matrix P such that $B = P^{-1}AP$.


The notion of similarity establishes a relation between matrices. This relation is
symmetric; that is, if the matrix A is similar to the matrix B, then B is similar to A.
To see this, let A be similar to B; that is, there is an invertible matrix P such that

$$B = P^{-1}AP$$

Now let $Q = P^{-1}$, so that B can be written as

$$B = QAQ^{-1}$$

Hence, $A = Q^{-1}BQ$, establishing that B is similar to A. For this reason we say that
A and B are similar if either A is similar to B or B is similar to A. In addition,
the relation is reflexive since any matrix is similar to itself with P being the identity
matrix. This relation is also transitive; that is, if A is similar to B and B is similar
to C, then A is similar to C. See Exercise 17. Any relation satisfying these three
properties is called an equivalence relation.




Fact Summary 


Let V be a finite dimensional vector space, $B_1$ and $B_2$ ordered bases of V, and T
a linear operator on V.

1. The matrix representations $[T]_{B_1}$ and $[T]_{B_2}$ are similar. That is, there is an
invertible matrix P such that $[T]_{B_2} = P^{-1}[T]_{B_1}P$. In addition, the matrix
P is the transition matrix from $B_2$ to $B_1$.

2. A matrix is similar to itself. If A is similar to B, then B is similar to A. If
A is similar to B and B is similar to C, then A is similar to C.


Exercise Set 4.5 ji 


In Exercises 1 and 2, [T]g, is the matrix representation of T relative to the bases By, 
representation of a linear operator relative to the basis and B2. 
Bı, and [T ]pg, is the matrix representation of the same y x+y 
operator relative to the basis B2. Show that the action 3. T (| D = | | 
of the operator on the vector v is the same whether 4 


using [T]g, or [T]z,. Bı = {e1, e2} 
1 —1 
MEOE Si 
1. [T]s; T | —1 3 [iri E | —1 J |: 1 1 
it [2] 
= ~ | —2 
m= tol] 
ees (eee 
m= ti}l ol} rls 
v=| a] By = {e1, e2} 
: m= {i} [2 |} 
A ’ 
0 1 —2 0 —1 2 
WS | ae 
B= 1 0 
te ofj 1 x x 
1 l 5. T y =| 0 
Bı = 2 > 1 Z Zz 
afs Bı = {e1, e2, e3} 
SFS 1 -1 0 
In Exercises 3—6, a linear operator T and bases B, B = 0 |, 1 |, | 0 
and B are given. 1 0 1 
a. Find [T]g, and [T]g,. 1 
b. Verify that the action on v of the linear Y= : 


operator T is the same when using the matrix 




x x+y 
6. T y =|x-ytz 
Z y-z 


Bı = {e1, e2, e3} 


—1 0 1 
B = 1 |,/ 0 |,] 0 
0 1 1 
2 
v= | -l 
—1 


In Exercises 7-10, [T]z, and [T]g, are, respectively, 
the matrix representations of a linear operator relative 
to the bases Bı and B2. Find the transition matrix 
P=[I ae and use Theorem 15 to show directly that 
the matrices are similar. 


© — 
— 
ip. 
— 
w 
N 
| 
—— 
l 

oO A 
Ree 
eer 


a E 
n-i ahlil 


In Exercises 11—14, find the matrix representation of 
the linear operator T relative to Bı. Then use 
Theorem 15 to find [T]z,. 


mrs S 


Bı = {e1, e2} 


15. Let $T\colon P_2 \to P_2$ be the linear operator defined
by $T(p(x)) = p'(x)$. Find the matrix
representation $[T]_{B_1}$ relative to the basis
$B_1 = \{1, x, x^2\}$ and the matrix representation
$[T]_{B_2}$ relative to $B_2 = \{1, 2x, x^2 - 2\}$. Find the
transition matrix $P = [I]_{B_2}^{B_1}$ and use Theorem 15
to show directly that the matrices $[T]_{B_1}$ and $[T]_{B_2}$
are similar.

16. Let $T\colon P_2 \to P_2$ be the linear operator defined
by $T(p(x)) = xp'(x) + p''(x)$. Find the matrix
representation $[T]_{B_1}$ relative to the basis
$B_1 = \{1, x, x^2\}$ and the matrix representation
$[T]_{B_2}$ relative to $B_2 = \{1, x, 1 + x^2\}$. Find the
transition matrix $P = [I]_{B_2}^{B_1}$ and use Theorem 15
to show directly that the matrices $[T]_{B_1}$ and $[T]_{B_2}$
are similar.

17. Show that if A and B are similar matrices and B
and C are similar matrices, then A and C are
similar matrices.

18. Show that if A and B are similar matrices, then
det(A) = det(B).

19. Show that if A and B are similar matrices, then
tr(A) = tr(B).

20. Show that if A and B are similar matrices, then
$A^t$ and $B^t$ are similar matrices.

21. Show that if A and B are similar matrices, then
$A^n$ and $B^n$ are similar matrices for each positive
integer n.

22. Show that if A and B are similar matrices and λ
is any scalar, then $\det(A - \lambda I) = \det(B - \lambda I)$.


4.6 > Application: Computer Graphics 


The rapid development of increasingly more powerful computers has led to the explo- 
sive growth of digital media. Computer-generated visual content is ubiquitous, found 
in almost every arena from advertising and entertainment to science and medicine. 
The branch of computer science known as computer graphics is devoted to the study 
of the generation and manipulation of digital images. Computer graphics are based 
on displaying two- or three-dimensional objects in two-dimensional space. Images 
displayed on a computer screen are stored in memory using data items called pixels, 
which is short for picture elements. A single picture can be comprised of millions 
of pixels, which collectively determine the image. Each pixel contains informa- 
tion on how to color the corresponding point on a computer screen, as shown in 
Fig. 1. If an image contains curves or lines, the pixels which describe the object 
may be connected by a mathematical formula. The saddle shown in Fig. 1 is an 
example. 


Figure 1 




Figure 2 


Figure 3 


Graphics Operations in $\mathbb{R}^2$


To manipulate images, computer programmers use linear transformations. Most of the
examples we consider in this section use linear operators on $\mathbb{R}^2$. One of the properties of linear transformations that is especially useful to our work here is that linear
transformations map lines to lines, and hence polygons to polygons. (See Exercise 10
of the Review Exercises for Chapter 4.) Therefore, to visualize the result of a linear
transformation on a polygon, we only need to transform the vertices. Connecting the
images of the vertices then gives the transformed polygon.
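The vertex-by-vertex approach can be sketched as follows. The matrix `A` and the unit square are hypothetical data chosen only to illustrate the idea, and `apply` and `transform_polygon` are assumed helper names.

```python
def apply(A, v):
    """Multiply the 2 x 2 matrix A (list of rows) by the vector v."""
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]


def transform_polygon(A, vertices):
    """Transform a polygon by transforming each of its vertices."""
    return [apply(A, v) for v in vertices]


A = [[2, 1], [0, 1]]                        # hypothetical linear operator on R^2
square = [[0, 0], [1, 0], [1, 1], [0, 1]]   # vertices listed in order
image = transform_polygon(A, square)

# A point on an edge lands on the corresponding edge of the image:
# the midpoint of an edge maps to the midpoint of the transformed edge.
midpoint = [1, 0.5]                          # midpoint of (1, 0) and (1, 1)
image_midpoint = apply(A, midpoint)
expected = [(image[1][0] + image[2][0]) / 2, (image[1][1] + image[2][1]) / 2]
```

The midpoint check illustrates why only the vertices need to be transformed: linearity carries each edge onto the segment joining the images of its endpoints.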


Scaling and Shearing 


A transformation on an object that results in a horizontal contraction or dilation 
(stretching) is called a horizontal scaling. For example, let T be the triangle shown 
in Fig. 2 with vertices (1, 1), (2, 1), and (3, $): Suppose that we wish to perform 
a horizontal scaling of T by a factor of 3. The transformed triangle T’ is obtained 
by multiplying the x coordinate of each vertex by 3. Joining the new vertices with 
straight lines produces the result shown in Fig. 3. 

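The linear operator $S\colon \mathbb{R}^2 \to \mathbb{R}^2$ that accomplishes this is given by

$$S\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 3x \\ y \end{bmatrix}$$

To find a matrix representation of S, let $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ be the standard basis for $\mathbb{R}^2$.
Then by Theorem 12 of Sec. 4.4, we have

$$[S]_B = \big[\; [S(\mathbf{e}_1)] \;\; [S(\mathbf{e}_2)] \;\big] = \begin{bmatrix} 3 & 0 \\ 0 & 1 \end{bmatrix}$$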


Let $\mathbf{v}_i$ and $\mathbf{v}_i'$, for i = 1, 2, and 3, be, respectively, the vertices (in vector form) of T
and T′. Since the coordinates of the vertices of T are given relative to B, the vertices
of T′ can be found by matrix multiplication. Specifically,


BABEL sB NEA 
f- 


These results are consistent with the transformed triangle T’, as shown in Fig. 3. 
In general, a horizontal scaling by a factor k is given by the linear transformation

$$S_h\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} kx \\ y \end{bmatrix}$$




Solution 


Figure 4 




The matrix representation of $S_h$, relative to the standard basis for $\mathbb{R}^2$, is given by

$$[S_h]_B = \begin{bmatrix} k & 0 \\ 0 & 1 \end{bmatrix}$$

Similarly, a vertical scaling is given by the linear operator

$$S_v\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x \\ ky \end{bmatrix}$$

The matrix representation of $S_v$, relative to the standard basis for $\mathbb{R}^2$, is given by

$$[S_v]_B = \begin{bmatrix} 1 & 0 \\ 0 & k \end{bmatrix}$$

If both components are multiplied by the same number k, then the result is called a
uniform scaling. In all the above cases, if k > 1, then the transformation is called a
dilation, or stretching; and if 0 < k < 1, then the operator is a contraction.
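A minimal Python sketch of the three kinds of scaling follows. The triangle is hypothetical data, and `scaling` is an assumed helper that returns the diagonal matrix with the horizontal factor in the first row and the vertical factor in the second.

```python
from fractions import Fraction


def scaling(kx, ky):
    """Matrix that scales x by kx and y by ky (standard basis)."""
    return [[kx, 0], [0, ky]]


def apply(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]


triangle = [[1, 1], [2, 1], [2, 3]]      # hypothetical vertices

horizontal = scaling(3, 1)               # horizontal dilation, k = 3
vertical = scaling(1, Fraction(1, 2))    # vertical contraction, k = 1/2
uniform = scaling(2, 2)                  # uniform scaling, k = 2

stretched = [apply(horizontal, v) for v in triangle]
squashed = [apply(vertical, v) for v in triangle]
doubled = [apply(uniform, v) for v in triangle]
```

Passing the contraction factor as a `Fraction` keeps the transformed coordinates exact rather than rounding them to floating point.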


Let T denote the triangle with vertices given by the vectors 


[2] ==[F] v[3] 


as shown in Fig. 4. 


a. Stretch the triangle horizontally by a factor of 2. 
b. Contract the triangle vertically by a factor of 3. 


c. Stretch the triangle horizontally by a factor of 2, and contract the triangle 
vertically by a factor of 3. 


a. To stretch the triangle horizontally by a factor of 2, we apply the matrix

$$\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$$

to each vertex to obtain


salt] [i] [2] 


Connecting the new vertices by straight-line segments gives the triangle T’ 
shown in Fig. 5(a). 


b. To contract the triangle vertically by a factor of 3, we apply the matrix

$$\begin{bmatrix} 1 & 0 \\ 0 & \tfrac{1}{3} \end{bmatrix}$$

to each vertex to obtain


s[i] 
3 




Figure 6 


The contracted triangle T” is shown in Fig. 5(b). 


Figure 5: (a) the stretched triangle T′; (b) the contracted triangle T″


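c. This operator is the composition of the linear operators of parts (a) and (b).
By Theorem 14 of Sec. 4.4, the matrix of the operator, relative to the standard
basis for $\mathbb{R}^2$, is given by the product

$$\begin{bmatrix} 1 & 0 \\ 0 & \tfrac{1}{3} \end{bmatrix}\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & \tfrac{1}{3} \end{bmatrix}$$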


Applying this matrix to the vertices of the original triangle gives 


4 2 
s] ei] E 
3 3 


as shown in Fig. 6. 


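Another type of transformation, called shearing, produces the visual effect of
slanting. The linear operator $S\colon \mathbb{R}^2 \to \mathbb{R}^2$ used to produce a horizontal shear has the
form

$$S\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x + ky \\ y \end{bmatrix}$$

where k is a real number. Relative to the standard basis B, the matrix representation
of S is given by

$$[S]_B = \begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$$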


; : : ; . . 0 
As an illustration, let T be the triangle of Fig. 7(a) with vertices vı = | 0 |: 


v= | : | and v3 = | ' } and let k = 2. After applying the matrix 


$$[S]_B = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$$


to each of the vertices of T, we obtain vi = | i |» = | z | and v3 = | f | 
The resulting triangle T’ is shown in Fig. 7(b). 


Figure 7: (a) the triangle T; (b) the sheared triangle T′


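A vertical shear is defined similarly by

$$S\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x \\ y + kx \end{bmatrix}$$

In this case the matrix for S, relative to the standard basis B, is given by

$$[S]_B = \begin{bmatrix} 1 & 0 \\ k & 1 \end{bmatrix}$$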
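Both kinds of shear can be sketched in Python as follows. The triangle below is hypothetical data chosen so that one edge lies on the x axis, and the function names are assumptions made for the illustration.

```python
def horizontal_shear(k):
    """Matrix sending (x, y) to (x + k*y, y)."""
    return [[1, k], [0, 1]]


def vertical_shear(k):
    """Matrix sending (x, y) to (x, y + k*x)."""
    return [[1, 0], [k, 1]]


def apply(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]


triangle = [[0, 0], [2, 0], [1, 2]]   # hypothetical vertices

sheared_h = [apply(horizontal_shear(2), v) for v in triangle]
sheared_v = [apply(vertical_shear(2), v) for v in triangle]
```

In the horizontal shear the two vertices on the x axis stay fixed because their y coordinate is 0, while the third vertex slides to the right; the vertical shear fixes only points on the y axis.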


| EXAMPLE2 | Perform a vertical shear, with k = 2, on the triangle of Fig. 2. 


Solution The matrix of this operator, relative to the standard basis for $\mathbb{R}^2$, is given by

$$[S]_B = \begin{bmatrix} 1 & 0 \\ 2 & 1 \end{bmatrix}$$
+ p Applying this matrix to the vertices 
+ N 1 2 3 
EOE lian Vie 1 vV = 1 Vey 3 
T we obtain 
[ = 1 ve 2 ye 3 


Figure 8 The images of the original triangle and the sheared triangle are shown in Fig. 8. 




Reflection 


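The reflection of a geometric object through a line produces the mirror image of the
object across the line. The linear operator that reflects a vector through the x axis is

$$R_x\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x \\ -y \end{bmatrix}$$

A reflection through the y axis is given by

$$R_y\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} -x \\ y \end{bmatrix}$$

and a reflection through the line y = x is given by

$$R_{x=y}\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} y \\ x \end{bmatrix}$$

The matrix representations, relative to the standard basis B, for each of these are given
by

$$[R_x]_B = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \qquad [R_y]_B = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix} \qquad [R_{x=y}]_B = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$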
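The three reflection matrices can be tried out with a short Python sketch. The test vector is hypothetical, and the constant and helper names are assumptions for this illustration.

```python
R_X = [[1, 0], [0, -1]]      # reflection through the x axis
R_Y = [[-1, 0], [0, 1]]      # reflection through the y axis
R_DIAG = [[0, 1], [1, 0]]    # reflection through the line y = x
IDENTITY = [[1, 0], [0, 1]]


def apply(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]


def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


v = [3, 1]                   # hypothetical vector
across_x = apply(R_X, v)
across_y = apply(R_Y, v)
across_diag = apply(R_DIAG, v)

# Reflecting twice through the same line returns every vector to itself.
squares = [mat_mul(R, R) for R in (R_X, R_Y, R_DIAG)]
```

Each reflection matrix is its own inverse, which the list `squares` confirms for all three.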


| EXAMPLE3 | Perform the following reflections on the triangle T of Fig. 4. 


a. Reflection through the x axis. 
b. Reflection through the y axis. 
c. Reflection through the line y = x. 


Solution a. The vertices of the triangle in Fig. 4 are given by 


n-e] eel] Heli 


Applying the matrix [R,]g to the vertices of the original triangle, we obtain 


“=| 2] 4=[4] 4=[4] 


The image of the triangle is shown in Fig. 9(a). 
b. Applying the matrix [R,]g to the vertices of the original triangle, we obtain 


[i] e | 


The image of the triangle with this reflection is shown in Fig. 9(b). 
c. Finally, applying the matrix [Rx=y]g to the vertices of the original triangle, 


we obtain 
/ 1 ri 1 Fi 3 
EEO HE || 2 S=] i 


The image of the triangle is shown in Fig. 9(c). 


Figure 9: images of the triangle under (a) reflection through the x axis, (b) reflection through the y axis, and (c) reflection through the line y = x


Reversing Graphics Operations 


The operations of scaling, shearing, and reflection are all reversible, and hence the matrix 
representations for each of these operators are invertible. By Corollary 2 of Sec. 4.4, to 
reverse one of these operations, we apply the inverse matrix to the transformed image. 


| EXAMPLE 4 | Let S be the linear operator that performs a reflection through the line y = x,
followed by a horizontal stretching by a factor of 2.

a. Find the matrix representation of S, relative to the standard basis B.

b. Find the matrix representation of the reverse operator, again relative to B.

Solution a. Using the matrices given above for these operations, and by Theorem 14 of
Sec. 4.4, the matrix of the transformation, relative to the standard basis for $\mathbb{R}^2$,
is given by the product

$$[S]_B = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 2 \\ 1 & 0 \end{bmatrix}$$

b. By Corollary 2 of Sec. 4.4, the matrix which reverses the operation of part (a)
is given by

$$[S^{-1}]_B = ([S]_B)^{-1} = -\frac{1}{2}\begin{bmatrix} 0 & -2 \\ -1 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ \tfrac{1}{2} & 0 \end{bmatrix}$$


As we noted in Example 4(a), if a graphics operation S is given by a sequence of linear operators $S_1, S_2, \ldots, S_n$, then

\[ S = S_n \circ S_{n-1} \circ \cdots \circ S_1 \]

The matrix representation, relative to the basis B, is then given by the matrix product

\[ [S]_B = [S_n]_B [S_{n-1}]_B \cdots [S_1]_B \]

The reverse process is given by

\[ [S]_B^{-1} = [S_1]_B^{-1} [S_2]_B^{-1} \cdots [S_n]_B^{-1} \]




Thus, applying the matrices $[S_n]_B^{-1}, [S_{n-1}]_B^{-1}, \ldots, [S_1]_B^{-1}$ in succession, so that the last operation performed is the first one undone, reverses the process one transformation at a time.
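As a hedged illustration of reversing a composite operation, the sketch below rebuilds the matrices of Example 4 and checks that the inverse of the product equals the product of the individual inverses taken in reverse order. The helper names (`matmul`, `inverse_2x2`) are illustrative, and exact arithmetic with `fractions.Fraction` is a choice made here to avoid rounding.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


REFLECT_XY = [[0, 1], [1, 0]]   # reflection through y = x (applied first)
STRETCH_X = [[2, 0], [0, 1]]    # horizontal stretch by 2 (applied second)

# The operation applied first is the rightmost factor of the product.
S = matmul(STRETCH_X, REFLECT_XY)
S_INV = inverse_2x2(S)

# Undo the stretch first, then the reflection.
S_INV_STEPWISE = matmul(inverse_2x2(REFLECT_XY), inverse_2x2(STRETCH_X))

if __name__ == "__main__":
    print(S, S_INV, S_INV_STEPWISE)
```

When the stepwise inverse is applied to a vector, the rightmost factor acts first, which is why the inverse of the stretch appears on the right.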


Translation 


A translation of a point in the plane moves the point vertically, horizontally, or both. 
For example, to translate the point (1, 3) three units to the right and two units up, add 
3 to the x coordinate and 2 to the y coordinate to obtain the point (4, 5). 


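Now let $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ be any vector in $\mathbb{R}^2$ and $\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$ some fixed vector. An operation $S\colon \mathbb{R}^2 \to \mathbb{R}^2$ of the form

\[ S(\mathbf{v}) = \mathbf{v} + \mathbf{b} = \begin{bmatrix} v_1 + b_1 \\ v_2 + b_2 \end{bmatrix} \]

is called a translation by the vector $\mathbf{b}$. This transformation is a linear operator if and only if $\mathbf{b} = \mathbf{0}$. Consequently, when $\mathbf{b} \neq \mathbf{0}$, then S cannot be accomplished by means of a 2 x 2 matrix. However, by using homogeneous coordinates, translation of a vector in $\mathbb{R}^2$ can be represented by a 3 x 3 matrix. The homogeneous coordinates of a vector in $\mathbb{R}^2$ are obtained by adding a third component whose value is 1. Thus, the homogeneous coordinates for the vector $\mathbf{v} = \begin{bmatrix} x \\ y \end{bmatrix}$ are given by

\[ \mathbf{w} = \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} \]

Now, to translate $\mathbf{w}$ by the vector $\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$, we let

\[ A = \begin{bmatrix} 1 & 0 & b_1 \\ 0 & 1 & b_2 \\ 0 & 0 & 1 \end{bmatrix} \]

so that

\[ A\mathbf{w} = \begin{bmatrix} 1 & 0 & b_1 \\ 0 & 1 & b_2 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} x + b_1 \\ y + b_2 \\ 1 \end{bmatrix} \]

To return to $\mathbb{R}^2$, we select the first two components of $A\mathbf{w}$ so that

\[ S(\mathbf{v}) = \begin{bmatrix} x + b_1 \\ y + b_2 \end{bmatrix} \]

as desired.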


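As an illustration of this, let $\mathbf{b} = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$. Using homogeneous coordinates, the 3 x 3 matrix to perform the translation is

\[ A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} \]

Now let $\mathbf{v} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}$. Then

\[ \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 0 \\ 1 \end{bmatrix} \]

The vector $S(\mathbf{v}) = \begin{bmatrix} 4 \\ 0 \end{bmatrix}$ is the translation of $\mathbf{v}$ by the vector $\mathbf{b}$.

Figure 10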
In the previous illustration the translation can be accomplished with less work 
by simply adding the vector b to v. The benefits of using a matrix representation are 
realized when we combine translation with other types of transformations. To do this, 
we note that all the previous linear operators can be represented by 3 x 3 matrices. 
For example, the 3 x 3 matrix for reflecting a point (in homogeneous coordinates) 
through the x axis is 


\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]


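EXAMPLE 5 Find the image of the triangle T of Fig. 4 under a translation by the vector $\mathbf{b} = \begin{bmatrix} -5 \\ 3 \end{bmatrix}$, followed by a horizontal scaling by a factor of 1.5, followed by a reflection through the x axis.

Solution The matrix for the composition of these operations is given by the product

\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1.5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -5 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1.5 & 0 & -7.5 \\ 0 & -1 & -3 \\ 0 & 0 & 1 \end{bmatrix} \]

The vertices of the original triangle in homogeneous coordinates are given by

\[ \mathbf{v}_1 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \qquad \mathbf{v}_2 = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \qquad \mathbf{v}_3 = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} \]

After applying the above matrix to each of these vectors, we obtain

\[ \mathbf{v}_1' = \begin{bmatrix} -7.5 \\ -4 \\ 1 \end{bmatrix} \qquad \mathbf{v}_2' = \begin{bmatrix} -4.5 \\ -4 \\ 1 \end{bmatrix} \qquad \mathbf{v}_3' = \begin{bmatrix} -6 \\ -6 \\ 1 \end{bmatrix} \]

The resulting triangle, along with the intermediate steps, is shown in Fig. 10.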
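The composite matrix of this example can be reproduced with a few lines of plain Python. This is a sketch only; `matmul` and `apply` are illustrative helpers, and the triangle vertices (0, 1), (2, 1), (1, 3) are those used in the worked solution.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def apply(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


TRANSLATE = [[1, 0, -5], [0, 1, 3], [0, 0, 1]]     # applied first
SCALE_X = [[1.5, 0, 0], [0, 1, 0], [0, 0, 1]]      # applied second
REFLECT_X = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]     # applied last

# The first operation is the rightmost factor of the product.
COMPOSITE = matmul(REFLECT_X, matmul(SCALE_X, TRANSLATE))

# Triangle vertices in homogeneous coordinates.
VERTICES = [[0, 1, 1], [2, 1, 1], [1, 3, 1]]
IMAGES = [apply(COMPOSITE, v) for v in VERTICES]

if __name__ == "__main__":
    print(COMPOSITE)
    print(IMAGES)
```

Multiplying the three matrices once and reusing the product is the main benefit of the matrix form: each additional vertex costs a single matrix-vector product.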


EXAMPLE 6 Find a 3 x 3 matrix that will transform the triangle shown in Fig. 11(a) to the triangle shown in Fig. 11(b).

Figure 11: (a) Triangle T; (b) Triangle T′

Solution Triangle T′ is obtained from triangle T through a horizontal scaling by a factor of 4, followed by a vertical scaling of 3, without changing the left vertex (1, 1). The scalings alone will move the point (1, 1) to (4, 3). One way to correct this is to first translate the triangle so that the left vertex is located at the origin, perform the scaling, and then translate back. The matrix to perform all these operations is the product of the matrices for each transformation. The matrix is given by

\[ \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 4 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 4 & 0 & -3 \\ 0 & 3 & -2 \\ 0 & 0 & 1 \end{bmatrix} \]

Notice that

\[ \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix}^{-1} \]

that is, the matrix representation for translation by $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is the inverse of the matrix representation for translation by $\begin{bmatrix} -1 \\ -1 \end{bmatrix}$.

Figure 12: Rotation by 45°
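The translate, scale, translate-back pattern generalizes to scaling about any fixed point. The sketch below is one way to express it in plain Python; `scale_about` and the other helper names are hypothetical, not taken from the text.

```python
def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def apply(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def translation(bx, by):
    return [[1, 0, bx], [0, 1, by], [0, 0, 1]]


def scaling(sx, sy):
    return [[sx, 0, 0], [0, sy, 0], [0, 0, 1]]


def scale_about(point, sx, sy):
    """Scale by (sx, sy) while keeping the given point fixed."""
    px, py = point
    to_origin = translation(-px, -py)   # applied first
    back = translation(px, py)          # applied last
    return matmul(back, matmul(scaling(sx, sy), to_origin))


M = scale_about((1, 1), 4, 3)

if __name__ == "__main__":
    print(M)
    print(apply(M, [1, 1, 1]))   # the fixed vertex should not move
```

The last column of the result records how far the pure scaling would have displaced the fixed point, with the sign reversed.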


Rotation 


Another common graphics operation is a rotation through an angle $\theta$. See Fig. 12. To describe how a point is rotated, let (x, y) be the coordinates of a point in $\mathbb{R}^2$ and $\theta$ a real number. From trigonometry, the new coordinates (x′, y′) of a point (x, y) rotated by $\theta$ rad about the origin are given by

\[ x' = x\cos\theta - y\sin\theta \]
\[ y' = x\sin\theta + y\cos\theta \]

If $\theta > 0$, then $\mathbf{v}$ is revolved about the origin in a counterclockwise direction. If $\theta < 0$, the direction is clockwise. These equations define a linear operator $S_\theta\colon \mathbb{R}^2 \to \mathbb{R}^2$ given by

\[ S_\theta\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} x\cos\theta - y\sin\theta \\ x\sin\theta + y\cos\theta \end{bmatrix} \]

The matrix of $S_\theta$ relative to the standard basis $B = \{\mathbf{e}_1, \mathbf{e}_2\}$ for $\mathbb{R}^2$ is given by

\[ [S_\theta]_B = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \]

When using homogeneous coordinates, we apply the matrix

\[ \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

Figure 13


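EXAMPLE 7 Find the image of the triangle of Fig. 4 under a translation by the vector $\mathbf{b} = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$, followed by a rotation of 30°, or $\pi/6$ rad, in the counterclockwise direction.

Solution The matrix for the combined operations is given by

\[ \begin{bmatrix} \cos\frac{\pi}{6} & -\sin\frac{\pi}{6} & 0 \\ \sin\frac{\pi}{6} & \cos\frac{\pi}{6} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} & 0 \\ \frac{1}{2} & \frac{\sqrt{3}}{2} & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{\sqrt{3}}{2} & -\frac{1}{2} & \frac{\sqrt{3}}{2} + \frac{1}{2} \\ \frac{1}{2} & \frac{\sqrt{3}}{2} & \frac{1}{2} - \frac{\sqrt{3}}{2} \\ 0 & 0 & 1 \end{bmatrix} \]

The vertices of the triangle in homogeneous coordinates are given by

\[ \mathbf{v}_1 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \qquad \mathbf{v}_2 = \begin{bmatrix} 2 \\ 1 \\ 1 \end{bmatrix} \qquad \text{and} \qquad \mathbf{v}_3 = \begin{bmatrix} 1 \\ 3 \\ 1 \end{bmatrix} \]

After applying the above matrix to each of these vectors, we obtain

\[ \mathbf{v}_1' = \begin{bmatrix} \frac{\sqrt{3}}{2} \\ \frac{1}{2} \\ 1 \end{bmatrix} \qquad \mathbf{v}_2' = \begin{bmatrix} \frac{3\sqrt{3}}{2} \\ \frac{3}{2} \\ 1 \end{bmatrix} \qquad \text{and} \qquad \mathbf{v}_3' = \begin{bmatrix} \sqrt{3} - 1 \\ \sqrt{3} + 1 \\ 1 \end{bmatrix} \]

The resulting triangle is shown in Fig. 13.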
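A numerical check of this translation followed by a rotation is sketched below, using only the standard `math` module. The helper names are illustrative, and the results are floating-point approximations of the exact values involving $\sqrt{3}$.

```python
import math


def matmul(A, B):
    """Product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def apply(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def rotation(theta):
    """Homogeneous matrix for a counterclockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]


def translation(bx, by):
    return [[1, 0, bx], [0, 1, by], [0, 0, 1]]


# Translate by (1, -1) first, then rotate by 30 degrees.
COMBINED = matmul(rotation(math.pi / 6), translation(1, -1))

VERTICES = [[0, 1, 1], [2, 1, 1], [1, 3, 1]]
IMAGES = [apply(COMBINED, v) for v in VERTICES]

if __name__ == "__main__":
    for image in IMAGES:
        print([round(x, 4) for x in image])
```

Reversing the order of the two factors would rotate first and translate second, which in general gives a different image.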


Projection 


Rendering a picture of a three-dimensional object on a flat computer screen requires projecting points in 3-space to points in 2-space. We discuss only one of many methods to project points in $\mathbb{R}^3$ to points in $\mathbb{R}^2$ that preserve the natural appearance of an object.

Parallel projection simulates the shadow that is cast onto a flat surface by a far away light source, such as the sun. Shown in Fig. 14 are rays intersecting an object in 3-space and the projection into 2-space. The orientation of the axes in Fig. 14 is such that the xy plane represents the computer screen.

Figure 14

To show how to find the xy coordinates of a projected point, let the vector

\[ \mathbf{v}_d = \begin{bmatrix} x_d \\ y_d \\ z_d \end{bmatrix} \]

represent the direction of the rays. If $(x_0, y_0, z_0)$ is a point in $\mathbb{R}^3$, then the parametric equations of the line going through the point and in the direction of $\mathbf{v}_d$ are given by

\[ x(t) = x_0 + t x_d \]
\[ y(t) = y_0 + t y_d \]
\[ z(t) = z_0 + t z_d \]

for all $t \in \mathbb{R}$. The coordinates of the projection of $(x_0, y_0, z_0)$ onto the xy plane are found by letting $z(t) = 0$. Solving for t, we obtain

\[ t = -\frac{z_0}{z_d} \]

Now, substituting this value of t into the first two equations above, we find the coordinates of the projected point, which are given by

\[ x_p = x_0 - \frac{z_0}{z_d} x_d \qquad y_p = y_0 - \frac{z_0}{z_d} y_d \qquad \text{and} \qquad z_p = 0 \]

The components of $\mathbf{v}_d$ can also be used to find the angles that the rays make with the z axis and the xz plane. In particular, we have

\[ \tan\psi = \frac{y_d}{x_d} \qquad \text{and} \qquad \tan\varphi = \frac{\sqrt{x_d^2 + y_d^2}}{z_d} \]

where $\psi$ is the angle $\mathbf{v}_d$ makes with the xz plane and $\varphi$ is the angle made with the z axis. On the other hand, if the angles $\psi$ and $\varphi$ are given, then these equations can be used to find the components of the projection vector $\mathbf{v}_d$.


EXAMPLE 8 Let $\psi = 30°$ and $\varphi = 26.6°$.

a. Find the direction vector $\mathbf{v}_d$ and project the cube, shown in Fig. 15, into $\mathbb{R}^2$. The vertices of the cube are located at the points (0, 0, 1), (1, 0, 1), (1, 0, 0), (0, 0, 0), (0, 1, 1), (1, 1, 1), (1, 1, 0), and (0, 1, 0).
b. Find a 3 x 3 matrix that will rotate the (projected) vertices of the cube by 30° and another that will translate the cube by the vector $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$.

Figure 15

Solution a. We can arbitrarily set $z_d = -1$. Then

\[ \tan\psi = \tan 30° \approx 0.577 = \frac{y_d}{x_d} \qquad \text{and} \qquad (\tan\varphi)^2 = (\tan 26.6°)^2 \approx (0.5)^2 = x_d^2 + y_d^2 \]

so that

\[ y_d = 0.577\, x_d \qquad \text{and} \qquad x_d^2 + y_d^2 = \frac{1}{4} \]

Solving the last two equations gives $x_d \approx 0.433$ and $y_d \approx 0.25$, so that the direction vector is

\[ \mathbf{v}_d = \begin{bmatrix} 0.433 \\ 0.25 \\ -1 \end{bmatrix} \]

Using the formulas for a projected point given above, we can project each vertex of the cube into $\mathbb{R}^2$. Connecting the images by line segments gives the picture shown in Fig. 16. The projected points are given in Table 1.


Table 1

Vertex    | Projected Point
(0, 0, 1) | (0.433, 0.25)
(1, 0, 1) | (1.433, 0.25)
(1, 0, 0) | (1, 0)
(0, 0, 0) | (0, 0)
(0, 1, 1) | (0.433, 1.25)
(1, 1, 1) | (1.433, 1.25)
(1, 1, 0) | (1, 1)
(0, 1, 0) | (0, 1)
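The projected points in Table 1 can be regenerated from the projection formulas. The sketch below assumes $z_d = -1$ and takes the positive root for $x_d$, as in the worked solution; the function names are hypothetical. It evaluates the tangent of 26.6° directly rather than rounding it to 0.5, so the direction vector agrees with the text only to about three decimal places.

```python
import math

# Vertices of the unit cube listed in the example.
CUBE = [(0, 0, 1), (1, 0, 1), (1, 0, 0), (0, 0, 0),
        (0, 1, 1), (1, 1, 1), (1, 1, 0), (0, 1, 0)]


def direction_from_angles(psi_deg, phi_deg, z_d=-1.0):
    """Direction vector with tan(psi) = y_d / x_d and
    tan(phi)^2 = (x_d^2 + y_d^2) / z_d^2, choosing x_d > 0."""
    radius = abs(z_d) * math.tan(math.radians(phi_deg))  # length of (x_d, y_d)
    psi = math.radians(psi_deg)
    return (radius * math.cos(psi), radius * math.sin(psi), z_d)


def project(point, direction):
    """Parallel projection of a point onto the xy plane along direction."""
    x0, y0, z0 = point
    x_d, y_d, z_d = direction
    t = -z0 / z_d                      # parameter value at which z(t) = 0
    return (x0 + t * x_d, y0 + t * y_d)


V_D = direction_from_angles(30.0, 26.6)
PROJECTED = [project(p, V_D) for p in CUBE]

if __name__ == "__main__":
    for vertex, image in zip(CUBE, PROJECTED):
        print(vertex, tuple(round(c, 3) for c in image))
```

Points that already lie in the xy plane (those with $z_0 = 0$) are left unchanged, which matches the rows of the table for the bottom face of the cube.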


b. Using homogeneous coordinates, we find the matrices to rotate the cube counterclockwise by 30° and translate the cube by the vector $\begin{bmatrix} 2 \\ 1 \end{bmatrix}$ are given by

\[ \begin{bmatrix} \cos\frac{\pi}{6} & -\sin\frac{\pi}{6} & 0 \\ \sin\frac{\pi}{6} & \cos\frac{\pi}{6} & 0 \\ 0 & 0 & 1 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \]

respectively. Depictions of the results when the original cube is rotated and then the result is translated are shown in Figs. 17 and 18.

Figure 16    Figure 17    Figure 18
Exercise Set 4.6

1. Find the matrix representation relative to the standard basis for the linear transformation $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ that transforms the triangle with vertices at the points (0, 0), (1, 1), and (2, 0) to the triangle shown in the figure.

a. [Figure]
b. [Figure]
c. [Figure]

2. Find the matrix representation relative to the standard basis for the linear transformation $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ that transforms the square with vertices at the points (0, 0), (1, 0), (1, 1), and (0, 1) to the polygon shown in the figure.

[Figure]


3. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the transformation that
performs a horizontal stretching by a factor of 3, 
followed by a vertical contraction by a factor of 2, 
followed by a reflection through the x axis. 


a. Find the matrix of T relative to the standard 
basis. 

b. Apply the transformation to the triangle with 
vertices (1, 0), (3, 0), and (2, 2), and give a 
sketch of the result. 

c. Find the matrix relative to the standard basis 
that reverses T. 


4. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the transformation that
performs a reflection through the y axis, followed 
by a horizontal shear by a factor of 3. 

a. Find the matrix of T relative to the standard 
basis. 

b. Apply the transformation to the rectangle with 
vertices (1, 0), (2, 0), (2,3), and (1, 3), and 
give a sketch of the result. 

c. Find the matrix relative to the standard basis 
that reverses T. 


5. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the transformation that
performs a rotation by 45°, followed by a 
reflection through the origin. 

a. Find the matrix of T relative to the standard 
basis. 




b. Apply the transformation to the square with 
vertices (0, 0), (1, 0), (1, 1), and (0, 1), and 
give a sketch of the result. 

c. Find the matrix relative to the standard basis 
that reverses T. 


6. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the transformation that performs a reflection through the line y = x, followed by a rotation of 90°.

a. Find the matrix of T relative to the standard 
basis. 

b. Apply the transformation to the triangle with 
vertices (0, 0), (2,0), and (1, 3), and give a 
sketch of the result. 

c. Find the matrix relative to the standard basis 
that reverses T. 

d. Describe this transformation in another way. 
Verify your answer. 


7. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the (nonlinear) transformation that performs a translation by the


vector | , followed by a rotation of 30°. 


a. Using homogeneous coordinates, find the 3 x 3 
matrix that performs the translation and 
rotation. 

b. Apply the transformation to the parallelogram 
with vertices (0, 0), (2, 0), (3, 1), and (1, 1), 
and give a sketch of the result. 

c. Find the matrix that reverses T. 


8. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the (nonlinear) transformation that performs a translation by the


vector E , followed by a reflection through 


the y axis. 


a. Using homogeneous coordinates, find the 3 x 3 
matrix that performs the translation and 
reflection. 


b. Apply the transformation to the trapezoid with 
vertices (0, 0), (3, 0), (2, 1), and (1, 1), and 
give a sketch of the result. 

c. Find the matrix that reverses T. 


9. Let


be a basis for R?, and let A be the triangle in the 

xy coordinate system with vertices (0, 0), (2, 2), 

and (0, 2). 

a. Find the coordinates of the vertices of A 
relative to B. 

b. Let T be the transformation that performs a 
reflection through the line y = x. Find [oe 
where S is the standard basis for R?. 

c. Apply the matrix found in part (b) to the 
coordinates found in part (a). Sketch the 
result. 

d. Show the same result is obtained by applying 


| K : to the original coordinates. 


10. Let 


be a basis for R?, and let A be the parallelogram 
in the xy coordinate system with vertices 
(0, 0), (1, 1), (1, 0), and (2, 1). 


a. Find the coordinates of the vertices of A 
relative to B. 


b. Find the matrix representation relative to B of 
the transformation T that performs a reflection 
through the horizontal axis. 

c. Apply the matrix found in part (b) to the 
coordinates found in part (a). Write the 
resulting vectors relative to the standard basis, 
and sketch the result. 

d. Find the matrix representation relative to the 
standard basis for the transformation that 
performs the same operation on the 
parallelogram. Apply this matrix to the original 
coordinates, and verify the result agrees with 
part (c). 


Review Exercises for Chapter 4 


1. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^4$ be a linear transformation.
a. Verify that 


(iH) 


is a basis for R?. 
b. If 


NON= 


determine 


(Hint: Find the coordinates of | 5 | relative 
to S.) 


c. Describe all vectors in N (T). 
d. Is the linear map T one-to-one? Explain. 
e. Find a basis for R(T). 
f. Is T onto? Give two reasons. 
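g. Find a basis for $\mathbb{R}^4$ that contains the vectors
\[ \begin{bmatrix} 1 \\ 0 \\ 1 \\ 1 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} -1 \\ 1 \\ 0 \\ 1 \end{bmatrix} \]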
1 1 


h. use the basis B= {|3 |. | orm 


and the basis found in part (g) for R4, call it C, 
and find a matrix representation for T with 
respect to the bases B and C. 


i. Apply the matrix A found in part (h) to an 


arbitrary vector 


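2. Define linear transformations $S, T\colon \mathcal{P}_3 \to \mathcal{P}_4$ and $H\colon \mathcal{P}_4 \to \mathcal{P}_4$ by

\[ S(p(x)) = p'(0) \]
\[ T(p(x)) = (x + 1)p(x) \]
\[ H(p(x)) = p'(x) + p(0) \]

a. Compute $H \circ T$ and $S \circ (H \circ T)$.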

b. Find the matrix for S, T, and H relative to the 
standard bases for P3 and P4. 

c. Show that T is one-to-one. 

d. Find R(T). 


3. Let $S, T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be transformations so S

reflects every vector through the x axis and T 

reflects every vector through the y axis. 

a. Give definitions for S and T. Show the 
mappings are linear transformations. 

b. Find the matrix for S and for T relative to the 
standard basis for R?. 

c. Find the matrix for the linear transformations 
ToS and SoT. Describe geometrically the 
action of ToS and SoT. 


4. a. Let $T\colon M_{2\times 2} \to M_{2\times 2}$ be defined by


ra=] 4 na 


Is T a linear transformation? Is T one-to-one? 
Is T an isomorphism? 


b. Let $T\colon M_{2\times 2} \to M_{2\times 2}$ be defined by
\[ T(A) = \begin{bmatrix} 1 & 0 \\ 1 & 0 \end{bmatrix} A \]
Is T a linear transformation? Is T one-to-one? 
Show R(T) is isomorphic to R?. 


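5. Let $\mathbf{v}_1$ and $\mathbf{v}_2$ be linearly independent vectors in $\mathbb{R}^2$ and $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ a linear operator such that

\[ T(\mathbf{v}_1) = \mathbf{v}_2 \qquad \text{and} \qquad T(\mathbf{v}_2) = \mathbf{v}_1 \]

Let $B = \{\mathbf{v}_1, \mathbf{v}_2\}$ and $B' = \{\mathbf{v}_2, \mathbf{v}_1\}$.

a. Find $[T]_B$.
b. Find $[T]_B^{B'}$.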


6. Let T: R? — R? be the linear operator that 


projects a vector across the line span { | = | \ 


and S: R? —> R? the linear operator that reflects a 
vector across the line span { | 7 | Let B 


denote the standard basis for R?. 
a. Find [T]g and [S] z. 


amardi (N) 


c. Find the matrix representation for the linear 
operator H: R? —> R? that reflects a vector 


across the subspace span { | = \ and 


across the subspace span { | i \ 


d. Find H (| = 


e. Find N(T) and N(S). 
f. Find all vectors v such that T(v) = v and all 
vectors v such that S(v) = v. 


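7. Let $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ be the linear operator that reflects a vector across the plane

\[ \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\} \]

The projection of a vector $\mathbf{u}$ onto a vector $\mathbf{v}$ is the vector

\[ \operatorname{proj}_{\mathbf{v}} \mathbf{u} = \frac{\mathbf{u} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\, \mathbf{v} \]

and the reflection of $\mathbf{v}$ across the plane with normal vector $\mathbf{n}$ is

\[ \mathbf{v} - 2 \operatorname{proj}_{\mathbf{n}} \mathbf{v} \]

Let B denote the standard basis for $\mathbb{R}^3$.

a. Find $[T]_B$.
b. Find $T\left(\begin{bmatrix} -1 \\ 2 \\ 1 \end{bmatrix}\right)$.
c. Find N(T).
d. Find R(T).
e. Find the matrix relative to B for $T^n$, $n \geq 2$.

8. Define a transformation $T\colon \mathcal{P}_2 \to \mathbb{R}$ by

\[ T(p(x)) = \int_0^1 p(x)\, dx \]

a. Show that T is a linear transformation.
b. Compute $T(-x^2 - 3x + 2)$.
c. Describe N(T). Is T one-to-one?
d. Find a basis for N(T).
e. Show that T is onto.
f. Let B be the standard basis for $\mathcal{P}_2$ and $B' = \{1\}$, a basis for $\mathbb{R}$. Find $[T]_B^{B'}$.
g. Compute $T(-x^2 - 3x + 2)$, using the matrix found in part (f).
h. Define linear operators $T\colon C^{(1)}[0, 1] \to C^{(0)}[0, 1]$ and $S\colon C^{(0)}[0, 1] \to C^{(1)}[0, 1]$ by

\[ T(f) = \frac{d}{dx} f(x) \]

and

\[ S(f) = F, \qquad \text{where } F(x) = \int_0^x f(t)\, dt \]

Find $T(xe^x)$ and $S(xe^x)$. Describe $S \circ T$ and $T \circ S$.

9. Let $T\colon V \to V$ be a linear operator such that $T^2 - T + I = 0$, where I denotes the identity mapping. Show that $T^{-1}$ exists and is equal to $I - T$.

10. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be a linear operator.

a. Show that the line segment between two vectors $\mathbf{u}$ and $\mathbf{v}$ in $\mathbb{R}^2$ can be described by

\[ t\mathbf{u} + (1 - t)\mathbf{v} \qquad \text{for } 0 \leq t \leq 1 \]

b. Show that the image of a line segment under the map T is another line segment.
c. A set in $\mathbb{R}^2$ is called convex if for every pair of points in the set, the line segment between the points is in the set. See the figure.

[Figure: Convex set; Not a convex set]

Suppose $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ is an isomorphism and S is a convex set in $\mathbb{R}^2$. Show that T(S) is a convex set.
d. Define $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ by

\[ T\left(\begin{bmatrix} x \\ y \end{bmatrix}\right) = \begin{bmatrix} 2x \\ y \end{bmatrix} \]

Show that T is an isomorphism. Let

\[ S = \left\{ \begin{bmatrix} x \\ y \end{bmatrix} \;\middle|\; x^2 + y^2 = 1 \right\} \]

Describe the image of the set S under the transformation T.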
Chapter 4: Chapter Test

In Exercises 1–40, determine whether the statement is true or false.

1. The transformation T: R? —> R? defined by

y x+y+2

is a linear transformation.

2. The transformation $T\colon \mathbb{R} \to \mathbb{R}$ defined by $T(x) = 2x - 1$ is a linear transformation.

3. If b = 0, then the transformation $T\colon \mathbb{R} \to \mathbb{R}$ defined by $T(x) = mx + b$ is a linear transformation.

4. If A is an m x n matrix, then T defined by

\[ T(\mathbf{v}) = A\mathbf{v} \]

is a linear transformation from $\mathbb{R}^n$ into $\mathbb{R}^m$.


5. Let A be a fixed matrix in $M_{n\times n}$. Define a transformation $T\colon M_{n\times n} \to M_{n\times n}$ by

\[ T(B) = (B + A)^2 - (B + 2A)(B - 3A) \]

If $A^2 = 0$, then T is a linear transformation.


6. Let u =


adv= [4 


| HIET R? — Rè is 


1 
0 
a linear operator and 


\[ T(\mathbf{u} + \mathbf{v}) = \mathbf{v} \qquad \text{and} \qquad T(2\mathbf{u} - \mathbf{v}) = \mathbf{u} + \mathbf{v} \]


then 


T (u) = | 


Whe WIN 
[| 


7. If T: R? — R? is defined by


aD 2]b5 


then T is an isomorphism. 


8. If $T\colon V \to W$ is a linear transformation and $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is a linearly independent set in V, then $\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is a linearly independent subset of W.


9. The vector spaces Pg and M3x3 are isomorphic.


10. If a linear map T: P4 — P; is defined by
T(p(x)) = p’(x), then T is a one-to-one map.


11. If A is an n x n invertible matrix, then as a mapping from $\mathbb{R}^n$ into $\mathbb{R}^n$ the null space of A consists of only the zero vector.


12. The linear operator T: R? —> R? defined by


ee ee 


13. If $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ is the transformation that reflects each vector through the origin, then the matrix for T relative to the standard basis for $\mathbb{R}^2$ is

\[ \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix} \]

14. A linear transformation preserves the operations of vector addition and scalar multiplication.


15. Every linear transformation between finite dimensional vector spaces can be defined using a matrix product.


16. A transformation $T\colon V \to W$ is a linear transformation if and only if

\[ T(c_1\mathbf{v}_1 + c_2\mathbf{v}_2) = c_1 T(\mathbf{v}_1) + c_2 T(\mathbf{v}_2) \]

for all vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ in V and scalars $c_1$ and $c_2$.


17. If $f\colon \mathbb{R} \to \mathbb{R}$ is a linear operator and $\varphi\colon \mathbb{R}^2 \to \mathbb{R}^2$ is defined by


g(x, y) eo (x,y Eä f(x)) 


then the mapping $\varphi$ is an isomorphism.


18. Let U, V, and W be finite dimensional vector spaces. If U is isomorphic to V and V is isomorphic to W, then U is isomorphic to W.


19. If $T\colon V \to V$ is a linear operator and $\mathbf{u} \in N(T)$, then

\[ T(c\mathbf{u} + \mathbf{v}) = T(\mathbf{v}) \]

for all $\mathbf{v} \in V$ and scalars c.


20. If P: R? — R? is the projection defined by
Pi yy el We 


then P? = P. 


21. If $T\colon V \to W$ is a linear transformation between vector spaces such that T assigns each element of a basis for V to the same element of W, then T is the identity mapping.


22. If T: R* > ® and dim(N(T)) = 2, then


dim(R(T)) = 3 


23. If $T\colon \mathbb{R}^4 \to \mathbb{R}^5$ and $\dim(R(T)) = 2$, then $\dim(N(T)) = 2$.


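24. If $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ is defined by

\[ T\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} 2x - y + z \\ x \\ y - x \end{bmatrix} \]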

then the matrix for $T^{-1}$ relative to the standard basis for $\mathbb{R}^3$ is
0 0 
0 0 1 
-1 — 1 
25. If $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ is defined by


1 

2 
a e 

y —x+ y 
$B = \{\mathbf{e}_1, \mathbf{e}_2\}$, and $B' = \{\mathbf{e}_2, \mathbf{e}_1\}$, then
' —1 1 

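26. There exists a linear transformation T between vector spaces such that $T(\mathbf{0}) \neq \mathbf{0}$.


27. The linear transformation $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ defined by

\[ T\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} x \\ 0 \\ y \end{bmatrix} \]

projects each vector in $\mathbb{R}^3$ onto the xy plane.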


28. The linear operator $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ defined by


(De Lb oll 


reflects each vector in $\mathbb{R}^2$ across the line y = x.


29. Let $T\colon V \to W$ be a linear transformation and $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ a basis for V. If T is onto, then $\{T(\mathbf{v}_1), \ldots, T(\mathbf{v}_n)\}$ is a basis for W.

30. The vector space $\mathcal{P}_2$ is isomorphic to the subspace of $\mathbb{R}^5$


a,b,ceR 


= 
II 
aoa 


j=) 


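31. If $T\colon V \to V$ is the identity transformation, then the matrix for T relative to any pair of bases B and B′ for V is the identity matrix.


32. If $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ is defined by

\[ T\left(\begin{bmatrix} x \\ y \\ z \end{bmatrix}\right) = \begin{bmatrix} x + y + z \\ y - x \\ y \end{bmatrix} \]

then $\dim(N(T)) = 1$.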


33. If T: Pz —> Pa is defined by

\[ T(ax^2 + bx + c) = 2ax + b \]

then a basis for N(T) is {−3}.

34. If $T\colon M_{2\times 2} \to M_{2\times 2}$ is defined by

\[ T(A) = A^2 - A \]

then N(T) = {0}.

35. If T: Pa — P3 is defined by

\[ T(p(x)) = p''(x) - xp'(x) \]

then T is onto.

36. If T: Pa — P3 is defined by

\[ T(p(x)) = p''(x) - xp'(x) \]

then $q(x) = x^2$ is in R(T).


37. The linear operator $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ defined by


x 3 -3 0 x 
Z 3 -1 1 Zz 


is an isomorphism. 
38. If A is an m x n matrix and $T\colon \mathbb{R}^n \to \mathbb{R}^m$ is defined by

\[ T(\mathbf{v}) = A\mathbf{v} \]

then the range of T is the set of all linear combinations of the column vectors of A.


39. If A is an m x n matrix with m > n and $T\colon \mathbb{R}^n \to \mathbb{R}^m$ is defined by

\[ T(\mathbf{v}) = A\mathbf{v} \]

then T cannot be one-to-one.


40. If A is an m x n matrix with m > n and $T\colon \mathbb{R}^n \to \mathbb{R}^m$ is defined by

\[ T(\mathbf{v}) = A\mathbf{v} \]

then T cannot be onto.


Eigenvalues and Eigenvectors


CHAPTER OUTLINE
5.1 Eigenvalues and Eigenvectors
5.2 Diagonalization
5.3 Application: Systems of Linear Differential Equations
5.4 Application: Markov Chains


A Markov chain is a mathematical model used to describe a random process that, at any given time t = 1, 2, 3, ..., is in one of a finite number of states. Between the times t and t + 1 the process moves from state j to state i with a probability $p_{ij}$. Markov processes are also memoryless; that is, the next state of the system depends only on the current state. As an example, consider a city C with surrounding residential areas N, S, E, and W. Residents can move between any two locations or stay in their current location, with fixed probabilities. In this case a state is the location of a resident at any given time. The state diagram shown in Fig. 1 describes the situation with the probabilities of moving from one location to another shown in the corresponding transition matrix $A = (p_{ij})$. For example, entry $p_{34} = 0.2$ is the probability that a resident in region E moves to region S. Since a resident is assumed to be living in one of the five regions, the probability of being in one of these regions is 1, and hence each column sum of A is equal to 1. A square matrix with each entry between 0 and 1 and column sums all equal to 1 is called a stochastic matrix. The initial distribution of the population is called the initial probability vector. Assume that the initial population distribution is given by the vector

\[ \mathbf{v} = \begin{bmatrix} 0.3 \\ 0.2 \\ 0.1 \\ 0.2 \\ 0.2 \end{bmatrix} \]

Figure 1: State diagram and transition matrix A, with states ordered C, N, S, E, W


Then the population distribution after one time step is $A\mathbf{v}$, after two time steps is $A(A\mathbf{v}) = A^2\mathbf{v}$, and so on. For example, after 10 time steps, the population distribution (rounded to two decimal places) is

\[ A^{10}\mathbf{v} = \begin{bmatrix} 0.21 \\ 0.20 \\ 0.16 \\ 0.20 \\ 0.23 \end{bmatrix} \]

Notice that the sum of the entries of the population distribution vector is equal to 1. Starting with some initial distribution vector, the long-term behavior of the Markov chain, that is, $A^n\mathbf{v}$ as n tends to infinity, gives the limiting population distribution in the five regions into the future. When $A^n\mathbf{v}$ approaches a distribution vector $\mathbf{s}$ as n tends toward infinity, we say that $\mathbf{s}$ is the steady-state vector. If a transition matrix for a Markov chain is a stochastic matrix with positive terms, then for any initial probability vector $\mathbf{v}$, there is a unique steady-state vector $\mathbf{s}$. Moreover, if $\mathbf{s}$ is the steady-state vector, then $A\mathbf{s} = \mathbf{s}$. Finding the steady-state vector is equivalent to solving the matrix equation

\[ A\mathbf{x} = \lambda\mathbf{x} \]

with $\lambda = 1$. In general, if there is a scalar $\lambda$ and a nonzero vector $\mathbf{v}$ such that $A\mathbf{v} = \lambda\mathbf{v}$, then $\lambda$ is called an eigenvalue for the matrix A and $\mathbf{v}$ is an eigenvector corresponding to the eigenvalue $\lambda$. In our Markov chain example, the steady-state vector corresponds to the eigenvalue $\lambda = 1$ for the transition matrix A.
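The convergence of $A^n\mathbf{v}$ to a steady-state vector can be observed by repeated multiplication. The five-region transition matrix of the example is given only in Fig. 1, so the sketch below uses a small hypothetical two-state stochastic matrix instead; the matrix entries, the starting vector, and the function names are all assumptions made for illustration.

```python
def apply(matrix, vector):
    """Return the matrix-vector product as a list."""
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def evolve(matrix, vector, steps):
    """Return A^steps v, computed by repeated multiplication."""
    for _ in range(steps):
        vector = apply(matrix, vector)
    return vector


# Hypothetical transition matrix: positive entries, each column sums to 1.
A = [[0.9, 0.5],
     [0.1, 0.5]]

v = [0.3, 0.7]          # hypothetical initial probability vector
s = evolve(A, v, 100)   # numerical approximation of the steady state

if __name__ == "__main__":
    print(s)
    print(apply(A, s))  # should agree with s up to rounding
```

For this particular matrix, solving $A\mathbf{s} = \mathbf{s}$ by hand together with the condition that the entries sum to 1 gives $\mathbf{s} = (5/6, 1/6)$, which the iteration approaches regardless of the starting probability vector.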

In the last decade the growth in the power of modern computers has, quite mirac- 
ulously, made it possible to compute the eigenvalues of a matrix with rows and 
columns in the billions. Google’s PageRank algorithm is essentially a Markov chain
with transition matrix consisting of numerical weights for each site on the World Wide 
Web used as a measure of its relative importance within the set. The algorithm was 
developed by Larry Page and Sergey Brin, the founders of Google. 


5.1 Eigenvalues and Eigenvectors

For any n x n matrix A, there exists at least one number-vector pair $\lambda$, $\mathbf{v}$ such that $A\mathbf{v} = \lambda\mathbf{v}$ (although $\lambda$ may be a complex number). That is, the product of A and $\mathbf{v}$ is a scaling of the vector $\mathbf{v}$. Many applications require finding such number-vector pairs.


Eigenvalues and Eigenvectors


One of the most important problems in linear algebra is the eigenvalue problem. It can be stated thus: If A is an n x n matrix, does there exist a nonzero vector $\mathbf{v}$ such that $A\mathbf{v}$ is a scalar multiple of $\mathbf{v}$?


DEFINITION 1 Eigenvalue and Eigenvector Let A be an n x n matrix. A number $\lambda$ is called an eigenvalue of A provided that there exists a nonzero vector $\mathbf{v}$ in $\mathbb{R}^n$ such that

\[ A\mathbf{v} = \lambda\mathbf{v} \]


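Every nonzero vector satisfying this equation is called an eigenvector of A corresponding to the eigenvalue $\lambda$.

The zero vector is a trivial solution to the eigenvalue equation for any number $\lambda$ and is not considered as an eigenvector.

As an illustration, let

\[ A = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix} \]

Observe that

\[ \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \]

so $\mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ is an eigenvector of A corresponding to the eigenvalue $\lambda_1 = 1$. We also have

\[ \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix} = -1 \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

so $\mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is another eigenvector of A corresponding to the eigenvalue $\lambda_2 = -1$.

In Example 1 we show how to find eigenvalues and eigenvectors for a 2 x 2 matrix.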


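EXAMPLE 1 Let

\[ A = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

a. Find the eigenvalues of A.
b. Find the eigenvectors corresponding to each of the eigenvalues found in part (a).

Solution a. The number $\lambda$ is an eigenvalue of A if there is a nonzero vector $\mathbf{v} = \begin{bmatrix} x \\ y \end{bmatrix}$ such that

\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \lambda \begin{bmatrix} x \\ y \end{bmatrix} \qquad \text{which is equivalent to} \qquad \begin{bmatrix} y \\ x \end{bmatrix} = \begin{bmatrix} \lambda x \\ \lambda y \end{bmatrix} \]

This matrix equation is equivalent to the homogeneous linear system

\[ \begin{cases} -\lambda x + y = 0 \\ x - \lambda y = 0 \end{cases} \]

By Theorem 17 of Sec. 1.6, the linear system has a nontrivial solution if and only if

\[ \begin{vmatrix} -\lambda & 1 \\ 1 & -\lambda \end{vmatrix} = 0 \]

Consequently, $\lambda$ is an eigenvalue of A if and only if

\[ \lambda^2 - 1 = 0 \qquad \text{so that} \qquad \lambda_1 = 1 \quad \text{and} \quad \lambda_2 = -1 \]

b. For $\lambda_1 = 1$, a vector $\mathbf{v}_1 = \begin{bmatrix} x \\ y \end{bmatrix}$ is an eigenvector if

\[ \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix} \]

This yields the linear system

\[ \begin{cases} -x + y = 0 \\ x - y = 0 \end{cases} \qquad \text{with solution set} \qquad S = \left\{ \begin{bmatrix} t \\ t \end{bmatrix} \;\middle|\; t \in \mathbb{R} \right\} \]

Thus, any vector of the form $\mathbf{v}_1 = \begin{bmatrix} t \\ t \end{bmatrix}$, for $t \neq 0$, is an eigenvector corresponding to the eigenvalue $\lambda_1 = 1$. In a similar way, we find that any vector of the form $\mathbf{v}_2 = \begin{bmatrix} t \\ -t \end{bmatrix}$, for $t \neq 0$, is an eigenvector of A corresponding to the eigenvalue $\lambda_2 = -1$. Specific eigenvectors of A can be found by choosing any value for t so that neither $\mathbf{v}_1$ nor $\mathbf{v}_2$ is the zero vector. For example, letting t = 1, we know that

\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix} \]

is an eigenvector corresponding to $\lambda_1 = 1$ and

\[ \mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \]

is an eigenvector corresponding to $\lambda_2 = -1$.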
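For a 2 x 2 matrix the determinant condition reduces to a quadratic in $\lambda$, which can be solved directly. The sketch below relies on the standard identity $\det(A - \lambda I) = \lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$ for 2 x 2 matrices, an identity not stated in the text, and handles only the case of real eigenvalues; the function names are illustrative.

```python
import math


def eigenvalues_2x2(A):
    """Real eigenvalues of a 2 x 2 matrix, from the quadratic
    lambda^2 - trace * lambda + det = 0."""
    (a, b), (c, d) = A
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        raise ValueError("eigenvalues are complex")
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)


def is_eigenpair(A, eigenvalue, vector):
    """Check A v = lambda v for a nonzero vector v (exact comparison)."""
    if not any(vector):
        return False   # the zero vector is never an eigenvector
    image = [sum(a * x for a, x in zip(row, vector)) for row in A]
    return image == [eigenvalue * x for x in vector]


A = [[0, 1], [1, 0]]
LAMBDAS = eigenvalues_2x2(A)

if __name__ == "__main__":
    print(LAMBDAS)
    print(is_eigenpair(A, 1, [1, 1]), is_eigenpair(A, -1, [1, -1]))
```

The exact comparison in `is_eigenpair` is adequate for integer examples like this one; with floating-point entries a tolerance would be needed.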


Geometric Interpretation of Eigenvalues and Eigenvectors 


A nonzero vector $\mathbf{v}$ is an eigenvector of a matrix A only when $A\mathbf{v}$ is a scaling of the vector $\mathbf{v}$. For example, let $A = \begin{bmatrix} \cdots \end{bmatrix}$. Using the techniques just introduced, the eigenvalues of A are $\lambda_1 = 2$ and $\lambda_2 = 3$ with corresponding eigenvectors $\mathbf{v}_1 = \begin{bmatrix} \cdots \end{bmatrix}$ and $\mathbf{v}_2 = \begin{bmatrix} \cdots \end{bmatrix}$, respectively. Observe that

\[ A\mathbf{v}_1 = 2\mathbf{v}_1 \]

and

\[ A\mathbf{v}_2 = 3\mathbf{v}_2 \]

Figure 2

In Fig. 2, we provide sketches of the vectors $\mathbf{v}_1$, $\mathbf{v}_2$, $A\mathbf{v}_1$, and $A\mathbf{v}_2$ to underscore that the action of A on each of its eigenvectors is a scaling. Observe that this is not the case for an arbitrary vector. For example, if $\mathbf{v} = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$, then

\[ A\mathbf{v} = \begin{bmatrix} \cdots \end{bmatrix} \]

Notice that in Example 1, for each of the eigenvalues there are infinitely many eigenvectors. This is the case in general. To show this, let $\mathbf{v}$ be an eigenvector of the matrix A corresponding to the eigenvalue $\lambda$. If c is any nonzero real number, then

\[ A(c\mathbf{v}) = cA(\mathbf{v}) = c(\lambda\mathbf{v}) = \lambda(c\mathbf{v}) \]

so $c\mathbf{v}$ is another eigenvector associated with the eigenvalue $\lambda$. Notice that all of the eigenvectors $c\mathbf{v}$ are parallel to $\mathbf{v}$ but can have the opposite direction.


Eigenspaces

Building on the procedure used in Example 1, we now describe a general method for finding eigenvalues and eigenvectors. If A is an n x n matrix, then

\[ A\mathbf{v} = \lambda\mathbf{v} \]

for some number $\lambda$ if and only if

\[ A\mathbf{v} - \lambda\mathbf{v} = \mathbf{0} \qquad \text{that is} \qquad (A - \lambda I)\mathbf{v} = A\mathbf{v} - \lambda I\mathbf{v} = \mathbf{0} \]

Again by Theorem 17 of Sec. 1.6, this equation has a nontrivial solution if and only if

\[ \det(A - \lambda I) = 0 \]

We summarize this result in Theorem 1.


THEOREM 1 The number $\lambda$ is an eigenvalue of the matrix A if and only if

\[ \det(A - \lambda I) = 0 \]


The equation $\det(A - \lambda I) = 0$ is called the characteristic equation of the matrix A, and the expression $\det(A - \lambda I)$ is called the characteristic polynomial of A. If A is an n x n matrix and $\lambda$ is an eigenvalue of A, then the set

\[ V_\lambda = \{\mathbf{v} \in \mathbb{R}^n \mid A\mathbf{v} = \lambda\mathbf{v}\} \]

is called the eigenspace of A corresponding to $\lambda$. Notice that $V_\lambda$ is the union of the set of eigenvectors corresponding to $\lambda$ and the zero vector.

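We have already shown that $V_\lambda$ is closed under scalar multiplication. Therefore, to show that $V_\lambda$ is a subspace of $\mathbb{R}^n$, we need to show that it is also closed under addition. To see this, let $\mathbf{u}$ and $\mathbf{v}$ be vectors in $V_\lambda$; that is, $A\mathbf{u} = \lambda\mathbf{u}$ and $A\mathbf{v} = \lambda\mathbf{v}$ for a particular eigenvalue $\lambda$. Then

\[ A(\mathbf{u} + \mathbf{v}) = A\mathbf{u} + A\mathbf{v} = \lambda\mathbf{u} + \lambda\mathbf{v} = \lambda(\mathbf{u} + \mathbf{v}) \]

Alternatively, the set

\[ V_\lambda = \{\mathbf{v} \in \mathbb{R}^n \mid A\mathbf{v} = \lambda\mathbf{v}\} = \{\mathbf{v} \in \mathbb{R}^n \mid (A - \lambda I)\mathbf{v} = \mathbf{0}\} = N(A - \lambda I) \]

Since $V_\lambda$ is the null space of the matrix $A - \lambda I$, by Theorem 3 of Sec. 4.2 it is a subspace of $\mathbb{R}^n$.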


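EXAMPLE 2 Find the eigenvalues and corresponding eigenvectors of

\[ A = \begin{bmatrix} 2 & -12 \\ 1 & -5 \end{bmatrix} \]

Give a description of the eigenspace corresponding to each eigenvalue.

Solution By Theorem 1, to find the eigenvalues we solve the characteristic equation

\[
\begin{aligned}
\det(A - \lambda I) &= \begin{vmatrix} 2 - \lambda & -12 \\ 1 & -5 - \lambda \end{vmatrix} \\
&= (2 - \lambda)(-5 - \lambda) - (1)(-12) \\
&= \lambda^2 + 3\lambda + 2 \\
&= (\lambda + 1)(\lambda + 2) = 0
\end{aligned}
\]

Thus, the eigenvalues are $\lambda_1 = -1$ and $\lambda_2 = -2$. To find the eigenvectors, we need
to find all nonzero vectors in the null spaces of $A - \lambda_1 I$ and $A - \lambda_2 I$. First, for
$\lambda_1 = -1$,

\[ A - \lambda_1 I = A + I = \begin{bmatrix} 2 & -12 \\ 1 & -5 \end{bmatrix} + \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & -12 \\ 1 & -4 \end{bmatrix} \]

The null space of $A + I$ is found by row-reducing the augmented matrix

\[ \left[\begin{array}{cc|c} 3 & -12 & 0 \\ 1 & -4 & 0 \end{array}\right] \quad \text{to} \quad \left[\begin{array}{cc|c} 1 & -4 & 0 \\ 0 & 0 & 0 \end{array}\right] \]

The solution set for this linear system is given by

\[ S = \left\{ \begin{bmatrix} 4t \\ t \end{bmatrix} \;\middle|\; t \in \mathbb{R} \right\} \]

Choosing $t = 1$, we obtain the eigenvector $\mathbf{v}_1 = \begin{bmatrix} 4 \\ 1 \end{bmatrix}$. Hence, the eigenspace corresponding
to $\lambda_1 = -1$ is

\[ V_{\lambda_1} = \left\{ t \begin{bmatrix} 4 \\ 1 \end{bmatrix} \;\middle|\; t \text{ is any real number} \right\} \]

For $\lambda_2 = -2$,

\[ A - \lambda_2 I = A + 2I = \begin{bmatrix} 4 & -12 \\ 1 & -3 \end{bmatrix} \]

In a similar way we find that the vector $\mathbf{v}_2 = \begin{bmatrix} 3 \\ 1 \end{bmatrix}$ is an eigenvector corresponding
to $\lambda_2 = -2$. The corresponding eigenspace is

\[ V_{\lambda_2} = \left\{ t \begin{bmatrix} 3 \\ 1 \end{bmatrix} \;\middle|\; t \text{ is any real number} \right\} \]

The eigenspaces $V_{\lambda_1}$ and $V_{\lambda_2}$ are lines in the direction of the eigenvectors $\begin{bmatrix} 4 \\ 1 \end{bmatrix}$
and $\begin{bmatrix} 3 \\ 1 \end{bmatrix}$, respectively. The images of the eigenspaces, after multiplication by
$A$, are the same lines, since the direction vectors $A\begin{bmatrix} 4 \\ 1 \end{bmatrix}$ and $A\begin{bmatrix} 3 \\ 1 \end{bmatrix}$ are scalar
multiples of $\begin{bmatrix} 4 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} 3 \\ 1 \end{bmatrix}$, respectively.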
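
The computation in Example 2 can be checked numerically. The following is a minimal sketch, not part of the text's method: it assumes the matrix of the example is $A = \begin{bmatrix} 2 & -12 \\ 1 & -5 \end{bmatrix}$, and the helper names `eigenvalues_2x2` and `mat_vec` are hypothetical. It finds the roots of the characteristic polynomial $\lambda^2 - \operatorname{tr}(A)\lambda + \det(A)$ with the quadratic formula and then confirms that $A\mathbf{v} = \lambda\mathbf{v}$ for each eigenvector.

```python
import math


def eigenvalues_2x2(a, b, c, d):
    """Real roots of lambda^2 - (a + d) lambda + (ad - bc) for [[a, b], [c, d]]."""
    trace = a + d
    det = a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        # Complex eigenvalues are outside the scope of this sketch.
        raise ValueError("characteristic polynomial has no real roots")
    root = math.sqrt(disc)
    return ((trace + root) / 2, (trace - root) / 2)


def mat_vec(A, v):
    """Product of a matrix (list of rows) with a vector (list)."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in A]


A = [[2, -12], [1, -5]]
lam1, lam2 = eigenvalues_2x2(2, -12, 1, -5)

v1 = [4, 1]  # eigenvector claimed for lam1
v2 = [3, 1]  # eigenvector claimed for lam2

print(lam1, lam2)
print(mat_vec(A, v1), [lam1 * x for x in v1])
print(mat_vec(A, v2), [lam2 * x for x in v2])
```

Each printed pair of vectors should agree, which is exactly the statement that the eigenspaces are mapped onto themselves by $A$.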


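In Example 3 we illustrate how the eigenspace associated with a single eigenvalue
can have dimension greater than 1.

EXAMPLE 3 Find the eigenvalues of

\[ A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 5 & -10 \\ 1 & 0 & 2 & 0 \\ 1 & 0 & 0 & 3 \end{bmatrix} \]

and find a basis for each of the corresponding eigenspaces.

Solution The characteristic equation of $A$ is

\[ \det(A - \lambda I) = \begin{vmatrix} 1 - \lambda & 0 & 0 & 0 \\ 0 & 1 - \lambda & 5 & -10 \\ 1 & 0 & 2 - \lambda & 0 \\ 1 & 0 & 0 & 3 - \lambda \end{vmatrix} = (\lambda - 1)^2(\lambda - 2)(\lambda - 3) = 0 \]

Thus, the eigenvalues are

\[ \lambda_1 = 1 \qquad \lambda_2 = 2 \qquad \text{and} \qquad \lambda_3 = 3 \]

Since the exponent of the factor $\lambda - 1$ is 2, we say that the eigenvalue $\lambda_1 = 1$ has
algebraic multiplicity 2. To find the eigenspace for $\lambda_1 = 1$, we reduce the matrix

\[ A - I = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 5 & -10 \\ 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 2 \end{bmatrix} \quad \text{to} \quad \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 0 & 1 & -2 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]

Hence, the eigenspace corresponding to $\lambda_1 = 1$ is

\[ V_{\lambda_1} = \left\{ s \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} + t \begin{bmatrix} -2 \\ 0 \\ 2 \\ 1 \end{bmatrix} \;\middle|\; s, t \in \mathbb{R} \right\} \]

Observe that the two vectors

\[ \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} -2 \\ 0 \\ 2 \\ 1 \end{bmatrix} \]

are linearly independent and hence form a basis for $V_{\lambda_1}$. Since $\dim(V_{\lambda_1}) = 2$, we
say that $\lambda_1$ has geometric multiplicity equal to 2. Alternatively, we can write

\[ V_{\lambda_1} = \operatorname{span}\left\{ \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} -2 \\ 0 \\ 2 \\ 1 \end{bmatrix} \right\} \]

Similarly, the eigenspaces corresponding to $\lambda_2 = 2$ and $\lambda_3 = 3$ are, respectively,

\[ V_{\lambda_2} = \operatorname{span}\left\{ \begin{bmatrix} 0 \\ 5 \\ 1 \\ 0 \end{bmatrix} \right\} \qquad \text{and} \qquad V_{\lambda_3} = \operatorname{span}\left\{ \begin{bmatrix} 0 \\ -5 \\ 0 \\ 1 \end{bmatrix} \right\} \]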


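In Example 3 the algebraic and geometric multiplicities of each eigenvalue are
equal. This is not the case in general. For example, if

\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix} \]

then the characteristic equation is $(\lambda - 1)^2 = 0$. Thus, $\lambda = 1$ has algebraic multiplicity
2. However,

\[ V_\lambda = \left\{ \begin{bmatrix} t \\ 0 \end{bmatrix} \;\middle|\; t \in \mathbb{R} \right\} \]

so $\lambda = 1$ has geometric multiplicity 1.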

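Although eigenvectors are always nonzero, an eigenvalue can be zero. Also, as
mentioned at the beginning of this section, eigenvalues can be complex numbers.
These cases are illustrated in Example 4.

EXAMPLE 4 Find the eigenvalues of

\[ A = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{bmatrix} \]

Solution The characteristic equation is

\[ \det(A - \lambda I) = \begin{vmatrix} -\lambda & 0 & 0 \\ 0 & -\lambda & -1 \\ 0 & 1 & -\lambda \end{vmatrix} = -\lambda^3 - \lambda = 0 \]

Thus, the eigenvalues are $\lambda_1 = 0$, $\lambda_2 = i$, and $\lambda_3 = -i$. The corresponding eigenvectors
are

\[ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad \begin{bmatrix} 0 \\ 1 \\ -i \end{bmatrix} \qquad \text{and} \qquad \begin{bmatrix} 0 \\ 1 \\ i \end{bmatrix} \]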

A fact that will be useful in the next section has to do with the eigenvalues of a
square triangular matrix. For example, let

\[ A = \begin{bmatrix} 2 & 4 \\ 0 & -3 \end{bmatrix} \]

Since $\det(A - \lambda I) = 0$ if and only if $(2 - \lambda)(-3 - \lambda) = 0$, we see that the eigenvalues
of $A$ are precisely the diagonal entries of $A$. In general, we have the following
result.

PROPOSITION 1 The eigenvalues of an $n \times n$ triangular matrix are the numbers on the diagonal.

Proof Let $A$ be an $n \times n$ triangular matrix. By Theorem 13 of Sec. 1.6, the
characteristic polynomial is given by

\[ \det(A - \lambda I) = (a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda) \]

Hence, $\det(A - \lambda I) = 0$ if and only if $\lambda_1 = a_{11}, \lambda_2 = a_{22}, \ldots, \lambda_n = a_{nn}$.


Eigenvalues and Eigenvectors of Linear Operators

The definitions of eigenvalues and eigenvectors can be extended to linear operators.

DEFINITION 2 Eigenvalue and Eigenvector of a Linear Operator Let $V$ be a vector space
and $T\colon V \to V$ be a linear operator. A number $\lambda$ is an eigenvalue of $T$ provided
that there is a nonzero vector $\mathbf{v}$ in $V$ such that $T(\mathbf{v}) = \lambda\mathbf{v}$. Every nonzero vector that
satisfies this equation is an eigenvector of $T$ corresponding to the eigenvalue $\lambda$.

As an illustration define $T\colon P_2 \to P_2$ by

\[ T(ax^2 + bx + c) = (-a + b + c)x^2 + (-b - 2c)x - 2b - c \]

Observe that

\[ T(-x^2 + x + 1) = 3x^2 - 3x - 3 = -3(-x^2 + x + 1) \]

so $p(x) = -x^2 + x + 1$ is an eigenvector of $T$ corresponding to the eigenvalue
$\lambda = -3$.
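
The illustration above can be replayed on coefficient triples. This is a small sketch under the assumption that a polynomial $ax^2 + bx + c$ is stored as the tuple `(a, b, c)`; the function name `T` mirrors the operator, but the representation is chosen here only for illustration.

```python
def T(p):
    """Apply T to p = (a, b, c), the coefficients of a*x^2 + b*x + c."""
    a, b, c = p
    return (-a + b + c, -b - 2 * c, -2 * b - c)


p = (-1, 1, 1)                 # p(x) = -x^2 + x + 1
image = T(p)                   # coefficients of T(p)
scaled = tuple(-3 * coeff for coeff in p)

print(image)                   # coefficients of 3x^2 - 3x - 3
print(image == scaled)         # True when T(p) = -3 p
```

Because $T(p)$ equals $-3p$ coefficient by coefficient, $p$ is an eigenvector of $T$ for the eigenvalue $-3$.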
Example 5 is from ordinary differential equations.

EXAMPLE 5 Interpret the solutions to the equation

\[ f'(x) = kf(x) \]

as an eigenvalue problem of a linear operator.

Solution Let $D$ denote the collection of all real-valued functions of one variable that have
derivatives of all orders. Examples of such functions are polynomials, the trigonometric
functions $\sin(x)$ and $\cos(x)$, and the natural exponential function $e^x$ on $\mathbb{R}$.
Define a linear operator $T\colon D \to D$ by

\[ T(f(x)) = f'(x) \]

Then $\lambda$ is an eigenvalue of $T$ if there is a function $f(x)$, not identically zero, such
that $T(f(x)) = \lambda f(x)$. That is, $f(x)$ satisfies the differential equation

\[ f'(x) = \lambda f(x) \]

Nonzero solutions to this differential equation are eigenvectors of the operator $T$,
called eigenfunctions, corresponding to the eigenvalue $\lambda$. The general solution to
this equation is given by

\[ f(x) = ke^{\lambda x} \]

where $k$ is an arbitrary constant. This class of functions is a model for exponential
growth and decay with extensive applications.


Fact Summary

Let $A$ be an $n \times n$ matrix.

1. The number $\lambda$ is an eigenvalue of $A$ if and only if $\det(A - \lambda I) = 0$.
2. The expression $\det(A - \lambda I)$ is a polynomial of degree $n$.
3. If $\mathbf{v}$ is an eigenvector of $A$ and $c$ is a nonzero scalar, then $c\mathbf{v}$ is another
eigenvector of $A$.
4. If $\lambda$ is an eigenvalue of $A$, then the eigenspace

\[ V_\lambda = \{\mathbf{v} \in \mathbb{R}^n \mid A\mathbf{v} = \lambda\mathbf{v}\} \]

is a subspace of $\mathbb{R}^n$.
5. The eigenspace corresponding to $\lambda$ is the null space of the matrix $A - \lambda I$.
6. The eigenvalues of a square triangular matrix are the diagonal entries.


Exercise Set 5.1


In Exercises 1—6, a matrix A and an eigenvector v are In Exercises 7—16, a matrix A is given. 


given. Find the corresponding eigenvalue directly by a. Find the characteristic equation for A. 
solving Av = dv. b. Find the eigenvalues of A. 
13 0 _ | 0 c. Find the eigenvectors corresponding to each 
1. A= v= ; 
1 3 1 eigenvalue. 
—] 1 = d. Verify the result of part (c) by showing that 
2. A= = v= AV; = NiVi. 
0 -2 
23). 3 1 7. A= | EE | 
3. A= | -1 -2 1 v=|0 K 
-3 2 3 1 = 
ies) 
1 0 1 
ae ee 2 1-2 
3 0 -1 9 A= | 0 1 | 
4 
-4 2 
4 
—1 0 1 
: i i a 11. A=| 01 0 
= 0 2 -l1 
Bet ty 120) 20 
0 1 0 1 0 20 
12. A= | 0 -1 1 
-1 0 01 
_ 0 
VS Ml oo 2 1 2 
1 13. A=]0 2 -l 
0 1 0 
1 1 
-1 -1 0 -1 1 1 1l 
6 A= 1 1 0 1 14.. A=|0 1 0 
=p -=j 0 0 1 
0 -1 0 00 
1 02 0 0 
oa ee RN 0 3, aoe 
0 00 04 


16.

e E
012 1

A=|o 02
0 0 0 Si

17. Show that if $\lambda^2 + b\lambda + c$ is the characteristic polynomial of the $2 \times 2$ matrix $A$, then
$b = -\operatorname{tr}(A)$ and $c = \det(A)$.

18. Let $A$ be an invertible matrix. Show that if $\lambda$ is an eigenvalue of $A$, then $1/\lambda$ is an
eigenvalue of $A^{-1}$.

19. Let $A$ be an $n \times n$ matrix. Show that $A$ is not invertible if and only if $\lambda = 0$ is an
eigenvalue of $A$.

20. Let $V$ be a vector space with $\dim(V) = n$ and $T\colon V \to V$ a linear operator. If $\lambda$ is an
eigenvalue of $T$ with geometric multiplicity $n$, then show that every nonzero vector of $V$ is an
eigenvector.

21. Let $A$ be an idempotent matrix. Show that if $\lambda$ is an eigenvalue of $A$, then $\lambda = 0$ or $\lambda = 1$.

22. Show that $A$ and $A^t$ have the same eigenvalues. Give an example to show $A$ and $A^t$ can have
different eigenvectors.

23. Show that if there is a positive integer $n$ such that $A^n = 0$, then $\lambda = 0$ is the only
eigenvalue of $A$.

24. Let $A = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$. Define an operator $T\colon M_{2 \times 2} \to M_{2 \times 2}$ by

\[ T(B) = AB - BA \]

a. Show that $\mathbf{e} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ is an eigenvector corresponding to the eigenvalue $\lambda = 2$.

b. Show that $\mathbf{f} = \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}$ is an eigenvector corresponding to the eigenvalue $\lambda = -2$.

25. Let $A$ and $B$ be $n \times n$ matrices with $A$ invertible. Show that $AB$ and $BA$ have the same
eigenvalues.

26. Show that no matrices $A$ and $B$ exist such that

\[ AB - BA = I \]


27. Show that the eigenvalues of a square triangular matrix are the diagonal entries of the matrix.

28. Let $\lambda$ be an eigenvalue of $A$. Use mathematical induction to show that for all $n$ in the
set of all natural numbers $\mathbb{N}$, if $\lambda$ is an eigenvalue of $A$, then $\lambda^n$ is an eigenvalue of $A^n$.
What can be said about corresponding eigenvectors?

29. Let $C = B^{-1}AB$. Show that if $\mathbf{v}$ is an eigenvector of $C$ corresponding to the eigenvalue $\lambda$,
then $B\mathbf{v}$ is an eigenvector of $A$ corresponding to $\lambda$.

30. Let $A$ be an $n \times n$ matrix and suppose $\mathbf{v}_1, \ldots, \mathbf{v}_m$ are eigenvectors of $A$. If
$S = \operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_m\}$, show that if $\mathbf{v} \in S$, then $A\mathbf{v} \in S$.

31. Let $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ be the linear operator that reflects a vector through the $x$ axis. Find
the eigenvalues and corresponding eigenvectors for $T$.

32. Define a linear operator $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ by

eats

Show that the only eigenvalues of T are } = +
Find the corresponding eigenvectors.

33. Define a linear operator $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ by

\[ T(\mathbf{v}) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \mathbf{v} \]

That is, the action of $T$ is a counterclockwise rotation of a vector by a nonnegative angle $\theta$.
Argue that if $\theta \neq 0, \pi$, then $T$ has no real eigenvalues; if $\theta = 0$, then $\lambda = 1$ is an
eigenvalue; and if $\theta = \pi$, then $\lambda = -1$ is an eigenvalue.

34. Let $D$ denote the function space of all real-valued functions that have two derivatives, and
define a linear operator $T$ on $D$ by

\[ T(f) = f'' - 2f' - 3f \]

a. Show that for each $k$, the function $f(x) = e^{kx}$ is an eigenfunction for the operator $T$.

b. Find the corresponding eigenvalues for each eigenfunction $f(x) = e^{kx}$.

c. Find two nonzero functions $f$ such that

\[ f''(x) - 2f'(x) - 3f(x) = 0 \]

35. Define a linear operator $T\colon P_2 \to P_2$ by

\[ T(ax^2 + bx + c) = (a - b)x^2 + cx \]

Define two ordered bases for $P_2$ by $B = \{x - 1, x + 1, x^2\}$ and $B' = \{x + 1, 1, x^2\}$.

a. Find the matrix representation for $T$ relative to the basis $B$.

b. Find the matrix representation for $T$ relative to the basis $B'$.

c. Show that the eigenvalues for the matrices found in parts (a) and (b) are the same.


5.2 Diagonalization

Many applications of linear algebra involve factoring a matrix and writing it as the
product of other matrices with special properties. For example, in Sec. 1.7, we saw
how the LU factorization of a matrix can be used to develop efficient algorithms for
solving a linear system with multiple input vectors. In this section, we determine if a
matrix $A$ has a factorization of the form

\[ A = PDP^{-1} \]

where $P$ is an invertible matrix and $D$ is a diagonal matrix. The ideas presented here
build on the concept of similarity of matrices, which we discussed in Sec. 4.5. Recall
that if $A$ and $B$ are $n \times n$ matrices, then $A$ is similar to $B$ if there exists an invertible
matrix $P$ such that

\[ B = P^{-1}AP \]

If $B$ is a diagonal matrix, then the matrix $A$ is called diagonalizable. Observe that if
$D$ is a diagonal matrix, then $A$ is diagonalizable if either

\[ D = P^{-1}AP \qquad \text{or} \qquad A = PDP^{-1} \]

for some invertible matrix $P$. One of the immediate benefits of diagonalizing a matrix
$A$ is realized when computing powers of $A$. This is often necessary when one is solving
systems of differential equations. To see this, suppose that $A$ is diagonalizable with

\[ A = PDP^{-1} \]

Then

\[ A^2 = (PDP^{-1})(PDP^{-1}) = PD(P^{-1}P)DP^{-1} = PD^2P^{-1} \]

Continuing in this way (see Exercise 27), we see that

\[ A^k = PD^kP^{-1} \]

for any positive whole number $k$. Since $D$ is a diagonal matrix, the diagonal entries of $D^k$ are
simply the diagonal entries of $D$ raised to the $k$th power.
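
The power formula can be tried on a small case. The sketch below is illustrative only: it assumes the matrix $A = \begin{bmatrix} 2 & -12 \\ 1 & -5 \end{bmatrix}$ with eigenvalues $-1$ and $-2$ and eigenvectors $(4, 1)$ and $(3, 1)$, so that $P = \begin{bmatrix} 4 & 3 \\ 1 & 1 \end{bmatrix}$ and $P^{-1} = \begin{bmatrix} 1 & -3 \\ -1 & 4 \end{bmatrix}$. The helper names are hypothetical. It compares $PD^kP^{-1}$ with the result of multiplying $A$ by itself $k$ times.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def power_via_diagonalization(P, diag, P_inv, k):
    """Compute P D^k P^{-1}, where D has the entries of diag on its diagonal."""
    n = len(diag)
    Dk = [[diag[i] ** k if i == j else 0 for j in range(n)] for i in range(n)]
    return mat_mul(mat_mul(P, Dk), P_inv)


def power_by_repeated_multiplication(A, k):
    """Compute A^k for a positive integer k by k - 1 matrix products."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result


A = [[2, -12], [1, -5]]
P = [[4, 3], [1, 1]]
P_inv = [[1, -3], [-1, 4]]
diag = [-1, -2]

for k in range(1, 7):
    left = power_via_diagonalization(P, diag, P_inv, k)
    right = power_by_repeated_multiplication(A, k)
    print(k, left == right)
```

Only the two diagonal entries are raised to the $k$th power on the diagonalized side, which is the saving the text describes.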

As we shall soon see, diagonalization of a matrix A depends on the number of 
linearly independent eigenvectors, and fails when A is deficient in this way. We note 
that a connection does not exist between a matrix being diagonalizable and the matrix 
having an inverse. A square matrix has an inverse if and only if the matrix has only 
nonzero eigenvalues (see Exercise 19 of Sec. 5.1). 


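EXAMPLE 1 Let

\[ A = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & -3 \end{bmatrix} \]

Show that $A$ is diagonalizable with

\[ P = \begin{bmatrix} 1 & 1 & 0 \\ -1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

Solution The inverse matrix is given by

\[ P^{-1} = \frac{1}{2} \begin{bmatrix} 1 & -1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \]

so that

\[ P^{-1}AP = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & -3 \end{bmatrix} \]

Therefore, the matrix $A$ is diagonalizable.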


The diagonal entries of the matrix $P^{-1}AP$, in Example 1, are the eigenvalues of
the matrix $A$, and the column vectors of $P$ are the corresponding eigenvectors. For
example, the product of $A$ and the first column vector of $P$ is given by

\[ A \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 & 2 & 0 \\ 2 & 1 & 0 \\ 0 & 0 & -3 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = -1 \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} \]

Similarly, the second and third diagonal entries of $P^{-1}AP$ are the eigenvalues of $A$
with corresponding eigenvectors the second and third column vectors of $P$, respectively.
With Theorem 2 this idea is extended to $n \times n$ matrices.

THEOREM 2 An $n \times n$ matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent
eigenvectors. Moreover, if $D = P^{-1}AP$, with $D$ a diagonal matrix, then the diagonal
entries of $D$ are the eigenvalues of $A$ and the column vectors of $P$ are the
corresponding eigenvectors.


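Proof First suppose that $A$ has $n$ linearly independent eigenvectors
$\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$, corresponding to the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Note that the
eigenvalues may not all be distinct. Let

\[ \mathbf{v}_1 = \begin{bmatrix} p_{11} \\ p_{21} \\ \vdots \\ p_{n1} \end{bmatrix} \qquad \mathbf{v}_2 = \begin{bmatrix} p_{12} \\ p_{22} \\ \vdots \\ p_{n2} \end{bmatrix} \qquad \ldots \qquad \mathbf{v}_n = \begin{bmatrix} p_{1n} \\ p_{2n} \\ \vdots \\ p_{nn} \end{bmatrix} \]

and define the $n \times n$ matrix $P$ so that the $i$th column vector is $\mathbf{v}_i$. Since the column
vectors of $P$ are linearly independent, by Theorem 9 of Sec. 2.3 the matrix $P$ is
invertible. Next, since the $i$th column vector of the product $AP$ is

\[ AP_i = A\mathbf{v}_i = \lambda_i\mathbf{v}_i \]

we have

\[
\begin{aligned}
AP &= \begin{bmatrix} \lambda_1 p_{11} & \lambda_2 p_{12} & \cdots & \lambda_n p_{1n} \\ \lambda_1 p_{21} & \lambda_2 p_{22} & \cdots & \lambda_n p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ \lambda_1 p_{n1} & \lambda_2 p_{n2} & \cdots & \lambda_n p_{nn} \end{bmatrix} \\
&= \begin{bmatrix} p_{11} & p_{12} & \cdots & p_{1n} \\ p_{21} & p_{22} & \cdots & p_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ p_{n1} & p_{n2} & \cdots & p_{nn} \end{bmatrix} \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \\
&= PD
\end{aligned}
\]

where $D$ is a diagonal matrix with diagonal entries the eigenvalues of $A$. So
$AP = PD$ and multiplying both sides on the left by $P^{-1}$ gives

\[ P^{-1}AP = D \]

The matrix $A$ is similar to a diagonal matrix and is therefore diagonalizable.
Conversely, suppose that $A$ is diagonalizable, that is, a diagonal matrix $D$ and
an invertible matrix $P$ exist such that

\[ D = P^{-1}AP \]

As above, denote the column vectors of the matrix $P$ by $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ and the
diagonal entries of $D$ by $\lambda_1, \lambda_2, \ldots, \lambda_n$. Since $AP = PD$, for each $i = 1, \ldots, n$,
we have

\[ A\mathbf{v}_i = \lambda_i\mathbf{v}_i \]

Hence, $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are eigenvectors of $A$. Since $P$ is invertible, then by Theorem
9 of Sec. 2.3 the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$ are linearly independent.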


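EXAMPLE 2 Use Theorem 2 to diagonalize the matrix

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 6 & -2 & 0 \\ 7 & -4 & 2 \end{bmatrix} \]

Solution Since $A$ is a triangular matrix, by Proposition 1 of Sec. 5.1, the eigenvalues of the
matrix $A$ are the diagonal entries

\[ \lambda_1 = 1 \qquad \lambda_2 = -2 \qquad \text{and} \qquad \lambda_3 = 2 \]

The corresponding eigenvectors, which are linearly independent, are given, respectively, by

\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix} \qquad \mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \qquad \text{and} \qquad \mathbf{v}_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \]

Therefore, by Theorem 2, $D = P^{-1}AP$, where

\[ D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 2 \end{bmatrix} \qquad \text{and} \qquad P = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \]

To verify that $D = P^{-1}AP$, we can avoid finding $P^{-1}$ by showing that

\[ PD = AP \]

In this case,

\[
\begin{aligned}
PD &= \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 2 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 2 & -2 & 0 \\ 1 & -2 & 2 \end{bmatrix} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ 6 & -2 & 0 \\ 7 & -4 & 2 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \\
&= AP
\end{aligned}
\]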
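
The check $PD = AP$ is easy to automate, and it avoids computing $P^{-1}$. The sketch below assumes the matrices of this example are $A = \begin{bmatrix} 1 & 0 & 0 \\ 6 & -2 & 0 \\ 7 & -4 & 2 \end{bmatrix}$, $P = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix}$, and $D = \operatorname{diag}(1, -2, 2)$; the helper `mat_mul` is a hypothetical name. Note that $PD = AP$ alone is not enough: $P$ must also be invertible, which holds here because $P$ is triangular with nonzero diagonal entries.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


A = [[1, 0, 0], [6, -2, 0], [7, -4, 2]]
P = [[1, 0, 0], [2, 1, 0], [1, 1, 1]]   # columns are the eigenvectors
D = [[1, 0, 0], [0, -2, 0], [0, 0, 2]]  # eigenvalues in matching order

AP = mat_mul(A, P)
PD = mat_mul(P, D)

print(AP)
print(PD)
print(AP == PD)
```

Column $i$ of $AP$ is $A\mathbf{v}_i$ and column $i$ of $PD$ is $\lambda_i\mathbf{v}_i$, so equality of the two products is the eigenvector equation written for all columns at once.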
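EXAMPLE 3 Let

\[ A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \qquad \text{and} \qquad B = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 2 \end{bmatrix} \]

Show that $A$ is diagonalizable but that $B$ is not diagonalizable.

Solution To find the eigenvalues of $A$, we solve the characteristic equation

\[ \det(A - \lambda I) = \det \begin{bmatrix} -\lambda & 1 & 1 \\ 1 & -\lambda & 1 \\ 1 & 1 & -\lambda \end{bmatrix} = -(\lambda + 1)^2(\lambda - 2) = 0 \]

Thus, the eigenvalues of $A$ are $\lambda_1 = -1$, with algebraic multiplicity 2, and $\lambda_2 = 2$,
with algebraic multiplicity 1. To find the eigenvectors, we find the null space of
$A - \lambda I$ for each eigenvalue. For $\lambda_1 = -1$ we reduce the matrix

\[ \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} \quad \text{to} \quad \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]

Hence,

\[ N(A + I) = \operatorname{span}\left\{ \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right\} \]

In a similar manner we find

\[ N(A - 2I) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\} \]

Since the three vectors $\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}$, and $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ are linearly independent,
by Theorem 2 the matrix $A$ is diagonalizable.
Using the same approach, we find that $B$ has the same characteristic polynomial
and hence the same eigenvalues. However, in this case

\[ N(B + I) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right\} \qquad \text{and} \qquad N(B - 2I) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 3 \\ 9 \end{bmatrix} \right\} \]

Since $B$ does not have three linearly independent eigenvectors, by Theorem 2, $B$
is not diagonalizable.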
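
The comparison between $A$ and $B$ comes down to the dimension of each null space $N(M - \lambda I)$, which is $n$ minus the rank of $M - \lambda I$. The sketch below is one way to compute this with exact rational arithmetic; it assumes $A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$ and $B = \begin{bmatrix} -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & 2 \end{bmatrix}$, and the function names `rank` and `geometric_multiplicity` are hypothetical.

```python
from fractions import Fraction


def rank(M):
    """Rank of M by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    n_rows, n_cols = len(rows), len(rows[0])
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if rows[i][col] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(n_rows):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == n_rows:
            break
    return r


def geometric_multiplicity(M, lam):
    """Dimension of the null space of M - lam*I."""
    n = len(M)
    shifted = [[M[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)


A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
B = [[-1, 1, 0], [0, -1, 1], [0, 0, 2]]

for name, M in (("A", A), ("B", B)):
    dims = [geometric_multiplicity(M, lam) for lam in (-1, 2)]
    print(name, dims, "diagonalizable" if sum(dims) == 3 else "not diagonalizable")
```

For $A$ the two dimensions add up to 3, so three linearly independent eigenvectors exist; for $B$ they add up to only 2.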


The matrix $P$ that diagonalizes an $n \times n$ matrix $A$ is not unique. For example, if
the columns of $P$ are permuted, then the resulting matrix also diagonalizes $A$. As an
illustration, the matrix $A$ of Example 3 is diagonalized by

\[ P = \begin{bmatrix} -1 & -1 & 1 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \qquad \text{with} \qquad P^{-1}AP = \begin{bmatrix} -1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \]

However, if $Q$ is the matrix obtained from interchanging columns 2 and 3 of $P$, then
$Q$ also diagonalizes $A$, with

\[ Q^{-1}AQ = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{bmatrix} \]

Notice, in this case, that the second and third diagonal entries are also interchanged.
Theorem 3 gives sufficient conditions for a matrix to be diagonalizable.


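THEOREM 3 Let $A$ be an $n \times n$ matrix, and let $\lambda_1, \lambda_2, \ldots, \lambda_n$ be distinct eigenvalues with
corresponding eigenvectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$. Then the set $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is linearly
independent.

Proof The proof is by contradiction. Assume that $\lambda_1, \lambda_2, \ldots, \lambda_n$ are distinct
eigenvalues of $A$ with corresponding eigenvectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n$, and assume that
the set of eigenvectors is linearly dependent. Then by Theorem 5 of Sec. 2.3,
at least one of the vectors can be written as a linear combination of the others.
Moreover, the eigenvectors can be reordered so that $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$, with $m < n$,
are linearly independent, but $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_{m+1}$ are linearly dependent with $\mathbf{v}_{m+1}$ a
nontrivial linear combination of the first $m$ vectors. Therefore, there are scalars
$c_1, \ldots, c_m$, not all 0, such that

\[ \mathbf{v}_{m+1} = c_1\mathbf{v}_1 + \cdots + c_m\mathbf{v}_m \]

This is the statement that will result in a contradiction. We multiply the last equation
by $A$ to obtain

\[
\begin{aligned}
A\mathbf{v}_{m+1} &= A(c_1\mathbf{v}_1 + \cdots + c_m\mathbf{v}_m) \\
&= c_1 A(\mathbf{v}_1) + \cdots + c_m A(\mathbf{v}_m)
\end{aligned}
\]

Further, since $\mathbf{v}_i$ is an eigenvector corresponding to the eigenvalue $\lambda_i$, then $A\mathbf{v}_i = \lambda_i\mathbf{v}_i$,
and after substitution in the previous equation, we have

\[ \lambda_{m+1}\mathbf{v}_{m+1} = c_1\lambda_1\mathbf{v}_1 + \cdots + c_m\lambda_m\mathbf{v}_m \]

Now multiplying both sides of $\mathbf{v}_{m+1} = c_1\mathbf{v}_1 + \cdots + c_m\mathbf{v}_m$ by $\lambda_{m+1}$, we also have

\[ \lambda_{m+1}\mathbf{v}_{m+1} = c_1\lambda_{m+1}\mathbf{v}_1 + \cdots + c_m\lambda_{m+1}\mathbf{v}_m \]

By equating the last two expressions for $\lambda_{m+1}\mathbf{v}_{m+1}$ we obtain

\[ c_1\lambda_1\mathbf{v}_1 + \cdots + c_m\lambda_m\mathbf{v}_m = c_1\lambda_{m+1}\mathbf{v}_1 + \cdots + c_m\lambda_{m+1}\mathbf{v}_m \]

or equivalently,

\[ c_1(\lambda_1 - \lambda_{m+1})\mathbf{v}_1 + \cdots + c_m(\lambda_m - \lambda_{m+1})\mathbf{v}_m = \mathbf{0} \]

Since the vectors $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$ are linearly independent, the only solution to the
previous equation is the trivial solution, that is,

\[ c_1(\lambda_1 - \lambda_{m+1}) = 0 \qquad c_2(\lambda_2 - \lambda_{m+1}) = 0 \qquad \ldots \qquad c_m(\lambda_m - \lambda_{m+1}) = 0 \]

Since all the eigenvalues are distinct, we have

\[ \lambda_1 - \lambda_{m+1} \neq 0 \qquad \lambda_2 - \lambda_{m+1} \neq 0 \qquad \ldots \qquad \lambda_m - \lambda_{m+1} \neq 0 \]

and consequently

\[ c_1 = 0 \qquad c_2 = 0 \qquad \ldots \qquad c_m = 0 \]

This contradicts the assumption that the nonzero vector $\mathbf{v}_{m+1}$ is a nontrivial linear
combination of $\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_m$.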


COROLLARY 1 If $A$ is an $n \times n$ matrix with $n$ distinct eigenvalues, then $A$ is diagonalizable.

EXAMPLE 4 Show that every $2 \times 2$ real symmetric matrix is diagonalizable.

Solution Recall that the matrix $A$ is symmetric if and only if $A = A^t$. Every $2 \times 2$ symmetric
matrix has the form

\[ A = \begin{bmatrix} a & b \\ b & d \end{bmatrix} \]

See Example 5 of Sec. 1.3. The eigenvalues are found by solving the characteristic
equation

\[ \det(A - \lambda I) = \begin{vmatrix} a - \lambda & b \\ b & d - \lambda \end{vmatrix} = \lambda^2 - (a + d)\lambda + ad - b^2 = 0 \]

By the quadratic formula, the eigenvalues are

\[ \lambda = \frac{a + d \pm \sqrt{(a - d)^2 + 4b^2}}{2} \]

Since the discriminant $(a - d)^2 + 4b^2 \geq 0$, the characteristic equation has either one
or two real roots. If $(a - d)^2 + 4b^2 = 0$, then $(a - d)^2 = 0$ and $b^2 = 0$, which holds
if and only if $a = d$ and $b = 0$. Hence, in this case the matrix $A$ is already diagonal. If
$(a - d)^2 + 4b^2 > 0$, then $A$ has two distinct eigenvalues; so by Corollary 1, the
matrix $A$ is diagonalizable.
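
The quadratic-formula argument translates directly into a short routine. This is a sketch only; the function name `symmetric_2x2_eigenvalues` is hypothetical, and the inputs are the entries $a$, $b$, $d$ of the symmetric matrix $\begin{bmatrix} a & b \\ b & d \end{bmatrix}$. Because the discriminant is a sum of squares, the square root is always defined for real inputs.

```python
import math


def symmetric_2x2_eigenvalues(a, b, d):
    """Eigenvalues of [[a, b], [b, d]] from the quadratic formula."""
    disc = (a - d) ** 2 + 4 * b ** 2  # a sum of squares, never negative
    root = math.sqrt(disc)
    return ((a + d + root) / 2, (a + d - root) / 2)


print(symmetric_2x2_eigenvalues(2, 1, 2))  # two distinct real eigenvalues
print(symmetric_2x2_eigenvalues(5, 0, 5))  # repeated eigenvalue: matrix is diagonal
```

The second call shows the borderline case $a = d$, $b = 0$, where the discriminant vanishes and the matrix is diagonal to begin with.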


By Theorem 2, if $A$ is diagonalizable, then $A$ is similar to a diagonal matrix
whose eigenvalues are the same as the eigenvalues of $A$. In Theorem 4 we show that
the same can be said about any two similar matrices.

THEOREM 4 Let $A$ and $B$ be similar $n \times n$ matrices. Then $A$ and $B$ have the same eigenvalues.

Proof Since $A$ and $B$ are similar matrices, there is an invertible matrix $P$ such
that $B = P^{-1}AP$. Now

\[
\begin{aligned}
\det(B - \lambda I) &= \det(P^{-1}AP - \lambda I) \\
&= \det(P^{-1}(AP - P(\lambda I))) \\
&= \det(P^{-1}(AP - \lambda IP)) \\
&= \det(P^{-1}(A - \lambda I)P)
\end{aligned}
\]

Applying Theorem 15 and Corollary 1 of Sec. 1.6, we have

\[
\begin{aligned}
\det(B - \lambda I) &= \det(P^{-1})\det(A - \lambda I)\det(P) \\
&= \det(P^{-1})\det(P)\det(A - \lambda I) \\
&= \det(A - \lambda I)
\end{aligned}
\]

Since the characteristic polynomials of $A$ and $B$ are the same, their eigenvalues
are equal.
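
Theorem 4 can be spot-checked on a $2 \times 2$ case, where the characteristic polynomial is $\lambda^2 - \operatorname{tr}(M)\lambda + \det(M)$. The matrices below are chosen here purely for illustration and are not taken from the text: $A = \begin{bmatrix} 2 & -12 \\ 1 & -5 \end{bmatrix}$ and $P = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$, whose inverse is $\begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix}$. The helper names are hypothetical.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def char_poly_2x2(M):
    """Coefficients (1, -trace, det) of lambda^2 - tr(M) lambda + det(M)."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)


A = [[2, -12], [1, -5]]
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]

B = mat_mul(mat_mul(P_inv, A), P)  # B is similar to A

print(B)
print(char_poly_2x2(A), char_poly_2x2(B))
```

The entries of $B$ differ from those of $A$, yet both matrices produce the same polynomial coefficients, and therefore the same eigenvalues.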


| _CEXAMPLE 5 Bs M v 
re | and Beles 


Verify that the matrices $A$ and $B = P^{-1}AP$ have the same eigenvalues.

Solution The characteristic equation for $A$ is

\[ \det(A - \lambda I) = (1 - \lambda)(3 - \lambda) = 0 \]

so the eigenvalues of $A$ are $\lambda_1 = 1$ and $\lambda_2 = 3$. Since
p=r'ar=| i i |[o ai 
the characteristic equation for $B$ is

\[ \det(B - \lambda I) = (1 - \lambda)(3 - \lambda) = 0 \]

and hence, the eigenvalues of $B$ are also $\lambda_1 = 1$ and $\lambda_2 = 3$.


In Sec. 4.5, we saw that a linear operator on a finite dimensional vector space
can have different matrix representations depending on the basis used to construct the
matrix. However, in every case the action of the linear operator on a vector remains
the same. These matrix representations also have the same eigenvalues.

COROLLARY 2 Let $V$ be a finite dimensional vector space, $T\colon V \to V$ a linear operator, and $B_1$
and $B_2$ ordered bases for $V$. Then $[T]_{B_1}$ and $[T]_{B_2}$ have the same eigenvalues.

Proof Let $P$ be the transition matrix from $B_2$ to $B_1$. Then by Theorem 15 of
Sec. 4.5, $P$ is invertible and $[T]_{B_2} = P^{-1}[T]_{B_1}P$. Therefore, by Theorem 4, $[T]_{B_1}$
and $[T]_{B_2}$ have the same eigenvalues.


Recall that in Example 3, the characteristic polynomial for $A$ and $B$ is
$-(\lambda + 1)^2(\lambda - 2)$. For the matrix $A$ the eigenspaces corresponding to $\lambda_1 = -1$
and $\lambda_2 = 2$ are

\[ V_{\lambda_1} = \operatorname{span}\left\{ \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right\} \qquad \text{and} \qquad V_{\lambda_2} = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\} \]

whereas the eigenspaces for $B$ are

\[ V_{\lambda_1} = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right\} \qquad \text{and} \qquad V_{\lambda_2} = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 3 \\ 9 \end{bmatrix} \right\} \]

Notice that for the matrix $A$, we have $\dim(V_{\lambda_1}) = 2$ and $\dim(V_{\lambda_2}) = 1$, which,
respectively, are equal to the corresponding algebraic multiplicities in the characteristic
polynomial. This is not the case for the matrix $B$, since $\dim(V_{\lambda_1}) = 1$ and
the corresponding algebraic multiplicity is 2. Moreover, for $A$, we have
$\dim(V_{\lambda_1}) + \dim(V_{\lambda_2}) = 3 = n$.

The general result describing this situation is given, without proof, in Theorem 5.

THEOREM 5 Let $A$ be an $n \times n$ matrix, and suppose that the characteristic polynomial is
$c(x - \lambda_1)^{d_1}(x - \lambda_2)^{d_2} \cdots (x - \lambda_k)^{d_k}$. The matrix $A$ is diagonalizable if and only if
$d_i = \dim(V_{\lambda_i})$, for each $i = 1, \ldots, k$, and

\[ d_1 + d_2 + \cdots + d_k = \dim(V_{\lambda_1}) + \dim(V_{\lambda_2}) + \cdots + \dim(V_{\lambda_k}) = n \]

To summarize Theorem 5, an $n \times n$ matrix $A$ is diagonalizable if and only if the
algebraic multiplicity for each eigenvalue is equal to the dimension of the corresponding
eigenspace, which is the corresponding geometric multiplicity, and the common
sum of these multiplicities is $n$.


Diagonalizable Linear Operators

In Theorem 12 of Sec. 4.4, we established that every linear operator on a finite
dimensional vector space has a matrix representation. The particular matrix for the
operator depends on the ordered basis used. From Corollary 2, we know that all
matrix representations for a given linear operator are similar. This allows us to make
the following definition.

DEFINITION 1 Diagonalizable Linear Operator Let $V$ be a finite dimensional vector space
and $T\colon V \to V$ a linear operator. The operator $T$ is called diagonalizable if there
is a basis $B$ for $V$ such that the matrix for $T$ relative to $B$ is a diagonal matrix.


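Now suppose that $V$ is a vector space of dimension $n$, $T\colon V \to V$ a linear
operator, and $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ a basis for $V$ consisting of $n$ eigenvectors. Then

\[ [T]_B = \Big[\, [T(\mathbf{v}_1)]_B \;\; [T(\mathbf{v}_2)]_B \;\; \cdots \;\; [T(\mathbf{v}_n)]_B \,\Big] \]

Since for each $i = 1, \ldots, n$ the vector $\mathbf{v}_i$ is an eigenvector, then $T(\mathbf{v}_i) = \lambda_i\mathbf{v}_i$, where
$\lambda_i$ is the corresponding eigenvalue. Since for each $i$, the basis vector $\mathbf{v}_i$ can be written
uniquely as a linear combination of $\mathbf{v}_1, \ldots, \mathbf{v}_n$, we have

\[ \mathbf{v}_i = 0\mathbf{v}_1 + \cdots + 0\mathbf{v}_{i-1} + \mathbf{v}_i + 0\mathbf{v}_{i+1} + \cdots + 0\mathbf{v}_n \]

Then the coordinate vector of $T(\mathbf{v}_i)$ relative to $B$ is

\[ [T(\mathbf{v}_i)]_B = \begin{bmatrix} 0 \\ \vdots \\ 0 \\ \lambda_i \\ 0 \\ \vdots \\ 0 \end{bmatrix} \]

with $\lambda_i$ in the $i$th position. Therefore, $[T]_B$ is a diagonal matrix. Alternatively, we can say that $T$ is diagonalizable
if there is a basis for $V$ consisting of eigenvectors of $T$. As an illustration, define the
linear operator $T\colon \mathbb{R}^2 \to \mathbb{R}^2$ by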


Observe that 


s[i] e Hefe 


are eigenvectors of $T$ with corresponding eigenvalues $\lambda_1 = 2$ and $\lambda_2 = 1$, respectively.
Let $B = \{\mathbf{v}_1, \mathbf{v}_2\}$, so that

\[ [T]_B = \Big[\, [T(\mathbf{v}_1)]_B \;\; [T(\mathbf{v}_2)]_B \,\Big] = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \]

is a diagonal matrix.

In practice it is not always so easy to determine the eigenvalues and eigenvectors
of $T$. However, if $B$ is any basis for $V$ such that $[T]_B$ is diagonalizable with diagonalizing
matrix $P$, then $T$ is diagonalizable. That is, if $B'$ is the basis consisting of the
column vectors of $P$, then $[T]_{B'} = P^{-1}[T]_B P$ is a diagonal matrix. This procedure
is illustrated in Example 6.


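EXAMPLE 6 Define the linear operator $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ by

\[ T\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = \begin{bmatrix} 3x_1 - x_2 + 2x_3 \\ 2x_1 + 2x_3 \\ x_1 + 3x_2 \end{bmatrix} \]

Show that $T$ is diagonalizable.

Solution Let $B = \{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ be the standard basis for $\mathbb{R}^3$. Then the matrix for $T$ relative to
$B$ is

\[ [T]_B = \begin{bmatrix} 3 & -1 & 2 \\ 2 & 0 & 2 \\ 1 & 3 & 0 \end{bmatrix} \]

Observe that the eigenvalues of $[T]_B$ are $\lambda_1 = -2$, $\lambda_2 = 4$, and $\lambda_3 = 1$ with
corresponding eigenvectors, respectively,

\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix} \qquad \mathbf{v}_2 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \qquad \text{and} \qquad \mathbf{v}_3 = \begin{bmatrix} -5 \\ 4 \\ 7 \end{bmatrix} \]

Now let $B' = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$ and

\[ P = \begin{bmatrix} 1 & 1 & -5 \\ 1 & 1 & 4 \\ -2 & 1 & 7 \end{bmatrix} \]

Then

\[ [T]_{B'} = \frac{1}{9} \begin{bmatrix} -1 & 4 & -3 \\ 5 & 1 & 3 \\ -1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 3 & -1 & 2 \\ 2 & 0 & 2 \\ 1 & 3 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 & -5 \\ 1 & 1 & 4 \\ -2 & 1 & 7 \end{bmatrix} = \begin{bmatrix} -2 & 0 & 0 \\ 0 & 4 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]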

Fact Summary

Let $A$ be an $n \times n$ matrix.

1. If $A$ is diagonalizable, then $A = PDP^{-1}$ or equivalently $D = P^{-1}AP$. The
matrix $D$ is a diagonal matrix with diagonal entries the eigenvalues of $A$.
The matrix $P$ is invertible, and its column vectors are the corresponding
eigenvectors.

2. If $A$ is diagonalizable, then the diagonalizing matrix $P$ is not unique. If the
columns of $P$ are permuted, then the diagonal entries of $D$ are permuted in
the same way.

3. The matrix $A$ is diagonalizable if and only if $A$ has $n$ linearly independent
eigenvectors.

4. If $A$ has $n$ distinct eigenvalues, then $A$ is diagonalizable.

5. Every $2 \times 2$ real symmetric matrix is diagonalizable and has real
eigenvalues.

6. Similar matrices have the same eigenvalues.

7. If $A$ is diagonalizable, then the algebraic multiplicity for each eigenvalue is
equal to the dimension of the corresponding eigenspace (the geometric
multiplicity). The common sum of these multiplicities is $n$.

8. Let $T\colon V \to V$ be a linear operator on a finite dimensional vector
space $V$. If $V$ has an ordered basis $B$ consisting of eigenvectors of $T$, then
$[T]_B$ is a diagonal matrix.

9. Let $T\colon V \to V$ be a linear operator and $B_1$ and $B_2$ ordered bases for $V$.
Then $[T]_{B_1}$ and $[T]_{B_2}$ have the same eigenvalues.


Exercise Set 5.2 


In Exercises 1—4, show that A is diagonalizable, using 


—1 1 
the matrix P. es | 0 -2 | 
1 (0) —2 0 
LA= | | - | | ee, 28 
2 -3 1 1 6. A= 2 2 
—1 1 
2. A= | | -1 —1 
oe ef 
1 -1 
p=| | -3 -2 
os 1 = 
8. A= | A | 
1 0 0 
3. A=|2 -2 0 9. A= 0 1 
0 2 0 0 1 
0 3 2 ng 
2 = 
P= | 0 -1 1 124 E E 
1 1 2 
2 2 0 
-1 2 2 11. A=|2 2 2 
4 A= 0 2 0 0 0 3 
2 -1 2 
—1 3 2 
1 -2 —2 12. A= | -1 2 3 
2 1 1 
In Exercises 5-18, find the eigenvalues, and if 2 
: : 13. A= 2 -l1 — 
necessary the corresponding eigenvectors, of A and 1 
determine whether A is diagonalizable. 7 


14. A= 


15. A= 


16. A= 


17. A= 


18. A= 


In Exercises 19—26, diagonalize the matrix A. 


21. 


22. 


23. 


24. 


25. 


c,h ee I S T. 


O = Ọ = 


i) 


2 
=i 
= 4 
1 2 
1 0 
0 -2 
i 29 
| 
0 2 
0 0 
—1 0 
=f 21 
0 0 
1 0 
0 0 
| 
1 0 
0 1 
1 0 
0 0 


- O m © 


O oOo 


or O = 


m- © 


Se eee 


26. 


SPR OF


0
1
1
1


SPR OF


1
0
1
1


27. Suppose $A$ is diagonalizable with $D = P^{-1}AP$. Show that for any positive integer $k$,

\[ A^k = PD^kP^{-1} \]

28. Let

2 1
eee

Factor $A$ in the form $A = PDP^{-1}$, where $D$ is a diagonal matrix. Then find $A^6$. See Exercise 27.

29. Let

\[ A = \begin{bmatrix} 3 & -1 & -2 \\ 2 & 0 & -2 \\ 2 & -1 & -1 \end{bmatrix} \]

Factor $A$ in the form $A = PDP^{-1}$, where $D$ is a diagonal matrix. Then find $A^k$, for any positive
integer $k$. See Exercise 27.

30. Suppose $A$ is an $n \times n$ matrix that is diagonalized by $P$. Find a matrix that diagonalizes $A^t$.

31. Suppose $A$ is an $n \times n$ matrix that is diagonalizable. Show that if $B$ is a matrix similar to $A$,
then $B$ is diagonalizable.

32. Show that if $A$ is invertible and diagonalizable, then $A^{-1}$ is diagonalizable. Find a $2 \times 2$ matrix
that is not a diagonal matrix, is not invertible, but is diagonalizable.

33. Suppose $A$ is an $n \times n$ matrix and $\lambda$ is an eigenvalue of multiplicity $n$. Show that $A$ is
diagonalizable if and only if $A = \lambda I$.

34. An $n \times n$ matrix $A$ is called nilpotent if there is a positive integer $k$ such that $A^k = 0$. Show that a
nonzero nilpotent matrix is not diagonalizable.

35. Define a linear operator $T\colon P_2 \to P_2$ by

\[ T(p(x)) = p'(x) \]

a. Find the matrix $A$ for $T$ relative to the standard basis $\{1, x, x^2\}$.

b. Find the matrix $B$ for $T$ relative to the basis $\{x, x - 1, x^2\}$.

c. Show the eigenvalues of $A$ and $B$ are the same.

d. Explain why $T$ is not diagonalizable.

36. Define a vector space $V = \operatorname{span}\{\sin x, \cos x\}$ and a linear operator $T\colon V \to V$ by $T(f(x)) = f'(x)$.
Show that $T$ is diagonalizable.

37. Define a linear operator $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ by

\[ T\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = \begin{bmatrix} 2x_1 + 2x_2 + 2x_3 \\ -x_1 + 2x_2 + x_3 \\ x_1 - x_2 \end{bmatrix} \]

Show that $T$ is not diagonalizable.

38. Define a linear operator $T\colon \mathbb{R}^3 \to \mathbb{R}^3$ by

\[ T\left( \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} \right) = \begin{bmatrix} 4x_1 + 2x_2 + 4x_3 \\ 4x_1 + 2x_2 + 4x_3 \\ 4x_3 \end{bmatrix} \]

Show that $T$ is diagonalizable.

39. Let $T$ be a linear operator on a finite dimensional vector space, $A$ the matrix for $T$ relative to a
basis $B_1$, and $B$ the matrix for $T$ relative to a basis $B_2$. Show that $A$ is diagonalizable if and
only if $B$ is diagonalizable.


5.3 Application: Systems of Linear Differential Equations

In Sec. 3.5 we considered only a single differential equation where the solution
involved a single function. However, in many modeling applications, an equation
that involves the derivatives of only one function is not sufficient. It is more likely
that the rate of change of a variable quantity will be linked to other functions outside
itself. This is the fundamental idea behind the notion of a dynamical system. One of
the most familiar examples of this is the predator-prey model. For example, suppose
we wish to create a model to predict the number of foxes and rabbits in some habitat.
The growth rate of the foxes is dependent on not only the number of foxes but also the
number of rabbits in their territory. Likewise, the growth rate of the rabbit population
in part is dependent on their current number, but is obviously mitigated by the number
of foxes in their midst. The mathematical model required to describe this relationship
is a system of differential equations of the form

\[
\begin{aligned}
y_1'(t) &= f(t, y_1, y_2) \\
y_2'(t) &= g(t, y_1, y_2)
\end{aligned}
\]

In this section we consider systems of linear differential equations. Problems such as
predator-prey problems involve systems of nonlinear differential equations.


Uncoupled Systems

At the beginning of Sec. 3.5 we saw that the differential equation given by

\[ y' = ay \]

has the solution $y(t) = Ce^{at}$, where $C = y(0)$. An extension of this to two dimensions
is the system of differential equations

\[
\begin{aligned}
y_1' &= ay_1 \\
y_2' &= by_2
\end{aligned}
\]

where $a$ and $b$ are constants and $y_1$ and $y_2$ are functions of a common variable $t$. This
system is called uncoupled since $y_1'$ and $y_2'$ depend only on $y_1$ and $y_2$, respectively.
The general solution of the system is found by solving each equation separately and
is given by

\[ y_1(t) = C_1e^{at} \qquad \text{and} \qquad y_2(t) = C_2e^{bt} \]

where $C_1 = y_1(0)$ and $C_2 = y_2(0)$.

The previous system of two differential equations can also be written in matrix 
form. To do this, define 


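\[
\mathbf{y}' = \begin{bmatrix} y_1' \\ y_2' \end{bmatrix} \qquad
A = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix} \qquad\text{and}\qquad
\mathbf{y} = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}
\]

Then the uncoupled system above is equivalent to the matrix equation

\[ \mathbf{y}' = A\mathbf{y} \]

The matrix form of the solution is given by

\[ \mathbf{y}(t) = \begin{bmatrix} e^{at} & 0 \\ 0 & e^{bt} \end{bmatrix} \mathbf{y}(0) \]

where $\mathbf{y}(0) = \begin{bmatrix} y_1(0) \\ y_2(0) \end{bmatrix}$.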


As an illustration, consider the system of differential equations 


y =y 
yy = 2y2 


In matrix form the system is written as 


The solution to the system is 
e“ 0 
y= | ia [bo 
that is, 
yi) =yiOe* and y(t) = y20)e” 


The Phase Plane 


In the case of a single differential equation, it is possible to sketch particular solutions 
in the plane to see explicitly how y(t) depends on the independent variable t. However, 
for a system of two differential equations, the solutions are vectors which depend on 
a common parameter t, which is usually time. A particular solution can be viewed as 
a parameterized curve or trajectory in the plane, called the phase plane. Shown in 
Fig. 1 are trajectories for several particular solutions of the system


y =y 
y = 2y2 




The vectors shown in Fig. 1 comprise the direction field for the system and 
describe the motion along a trajectory for increasing t. This sketch is called the phase 
portrait for the system. Phase portraits are usually drawn without the direction field. 
We have done so here to give a more complete picture of the system and its solutions. 


Diagonalization 


In the previous example, the matrix $A$ is diagonal, as is the case for any uncoupled
system of differential equations. We now consider more general systems of the form

\[ \mathbf{y}' = A\mathbf{y} \]

for which $A$ is not a diagonal matrix, but is diagonalizable with distinct real
eigenvalues. To solve problems of this type, our strategy is to reduce the system
$\mathbf{y}' = A\mathbf{y}$ to one that is uncoupled.

To develop this idea, let A be a 2 x 2 diagonalizable matrix with distinct real 
eigenvalues. Consider the system of differential equations given by 


\[ \mathbf{y}' = A\mathbf{y} \]

Since $A$ is diagonalizable, by Theorem 2 of Sec. 5.2 there is a diagonal matrix
$D$ and an invertible matrix $P$ such that

\[ D = P^{-1}AP \]

The diagonal matrix $D$ is given by

\[ D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \]

where $\lambda_1$ and $\lambda_2$ are the eigenvalues of $A$. The column vectors of $P$ are the
corresponding eigenvectors. To uncouple the system $\mathbf{y}' = A\mathbf{y}$, let

\[ \mathbf{w} = P^{-1}\mathbf{y} \]




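Differentiating both sides of the last equation gives

\[
\begin{aligned}
\mathbf{w}' = (P^{-1}\mathbf{y})' &= P^{-1}\mathbf{y}' \\
&= P^{-1}A\mathbf{y} \\
&= P^{-1}(PDP^{-1})\mathbf{y} = (P^{-1}P)(DP^{-1})\mathbf{y} \\
&= DP^{-1}\mathbf{y} \\
&= D\mathbf{w}
\end{aligned}
\]

Since $D$ is a diagonal matrix, the original linear system $\mathbf{y}' = A\mathbf{y}$ is transformed into
the uncoupled linear system

\[ \mathbf{w}' = P^{-1}AP\mathbf{w} = D\mathbf{w} \]

The general solution of this new system is given by

\[ \mathbf{w}(t) = \begin{bmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{bmatrix} \mathbf{w}(0) \]

Now, to find the solution to the original system, we again use the substitution
$\mathbf{w} = P^{-1}\mathbf{y}$ to obtain

\[ P^{-1}\mathbf{y}(t) = \begin{bmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{bmatrix} P^{-1}\mathbf{y}(0) \]

Hence, the solution to the original system is

\[ \mathbf{y}(t) = P \begin{bmatrix} e^{\lambda_1 t} & 0 \\ 0 & e^{\lambda_2 t} \end{bmatrix} P^{-1}\mathbf{y}(0) \]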
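
The formula $\mathbf{y}(t) = P e^{Dt} P^{-1}\mathbf{y}(0)$ can be tried numerically. The following is a minimal sketch, assuming a 2 x 2 matrix with two distinct real eigenvalues; the function names `eigen_2x2` and `solve_linear_system` are hypothetical and chosen only for illustration.

```python
import math


def eigen_2x2(A):
    """Return [(lambda, eigenvector), ...] for a 2x2 matrix A.

    Assumes A has two distinct real eigenvalues (the case treated in the text).
    """
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c
    disc = tr * tr - 4.0 * det
    if disc <= 0:
        raise ValueError("expected two distinct real eigenvalues")
    r = math.sqrt(disc)
    pairs = []
    for lam in ((tr - r) / 2.0, (tr + r) / 2.0):
        # Solve (A - lam*I) v = 0 using whichever row of A - lam*I is nonzero.
        if abs(a - lam) > 1e-12 or abs(b) > 1e-12:
            v = (b, lam - a)
        else:
            v = (lam - d, c)
        pairs.append((lam, v))
    return pairs


def solve_linear_system(A, y0, t):
    """Evaluate y(t) = P exp(Dt) P^{-1} y(0) for y' = Ay."""
    (l1, v1), (l2, v2) = eigen_2x2(A)
    p11, p21 = v1  # first column of P
    p12, p22 = v2  # second column of P
    det_p = p11 * p22 - p12 * p21
    # w(0) = P^{-1} y(0), using the 2x2 inverse formula
    w1 = (p22 * y0[0] - p12 * y0[1]) / det_p
    w2 = (-p21 * y0[0] + p11 * y0[1]) / det_p
    # The uncoupled system w' = Dw is solved componentwise.
    w1 *= math.exp(l1 * t)
    w2 *= math.exp(l2 * t)
    # Transform back: y(t) = P w(t)
    return (p11 * w1 + p12 * w2, p21 * w1 + p22 * w2)
```

The eigenvectors are found directly from a nonzero row of $A - \lambda I$, so any nonzero scalar multiple of the eigenvectors used in the text may appear as the columns of $P$; the resulting $\mathbf{y}(t)$ does not depend on that scaling.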


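EXAMPLE 1  Find the general solution to the system of differential equations

\[
\begin{aligned}
y_1' &= -y_1 \\
y_2' &= 3y_1 + 2y_2
\end{aligned}
\]

Sketch several trajectories in the phase plane.

Solution  The differential equation is given in matrix form by

\[ \mathbf{y}' = A\mathbf{y} = \begin{bmatrix} -1 & 0 \\ 3 & 2 \end{bmatrix} \mathbf{y} \]

After solving the characteristic equation $\det(A - \lambda I) = 0$, we know that the
eigenvalues of $A$ are $\lambda_1 = -1$ and $\lambda_2 = 2$ with corresponding eigenvectors

\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \]

Hence, the matrix $P$ which diagonalizes $A$ (see Theorem 2 of Sec. 5.2) is

\[ P = \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} \quad\text{with}\quad P^{-1} = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \]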




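The related uncoupled system is then given by

\[
\begin{aligned}
\mathbf{w}' &= P^{-1}AP\mathbf{w} \\
&= \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix}
   \begin{bmatrix} -1 & 0 \\ 3 & 2 \end{bmatrix}
   \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} \mathbf{w} \\
&= \begin{bmatrix} -1 & 0 \\ 0 & 2 \end{bmatrix} \mathbf{w}
\end{aligned}
\]

whose general solution is

\[ \mathbf{w}(t) = \begin{bmatrix} e^{-t} & 0 \\ 0 & e^{2t} \end{bmatrix} \mathbf{w}(0) \]

Hence, the solution to the original system is given by

\[
\begin{aligned}
\mathbf{y}(t) &= \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix}
   \begin{bmatrix} e^{-t} & 0 \\ 0 & e^{2t} \end{bmatrix}
   \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \mathbf{y}(0) \\
&= \begin{bmatrix} e^{-t} & 0 \\ -e^{-t} + e^{2t} & e^{2t} \end{bmatrix} \mathbf{y}(0)
\end{aligned}
\]

The general solution can also be written in the form

\[ y_1(t) = y_1(0)e^{-t} \quad\text{and}\quad y_2(t) = -y_1(0)e^{-t} + [y_1(0) + y_2(0)]e^{2t} \]

The phase portrait is shown in Fig. 2. The signs of the eigenvalues and the
direction of the corresponding eigenvectors help to provide qualitative information
about the trajectories in the phase portrait. In particular, notice in Fig. 2 that along
the line spanned by the eigenvector $\mathbf{v}_1 = \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ the flow is directed toward the
origin. This is so because the sign of $\lambda_1 = -1$ is negative. On the other hand, flow
is away from the origin along the line spanned by $\mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$, since in this case
$\lambda_2 = 2$ is positive.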




In Example 2 we describe the solution for a system when the eigenvalues have 
the same sign. 


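EXAMPLE 2  Find the general solution to the system of differential equations

\[
\begin{aligned}
y_1' &= y_1 + 3y_2 \\
y_2' &= 2y_2
\end{aligned}
\]

Solution  The system of differential equations is given in matrix form by

\[ \mathbf{y}' = A\mathbf{y} = \begin{bmatrix} 1 & 3 \\ 0 & 2 \end{bmatrix} \mathbf{y} \]

The eigenvalues of $A$ are $\lambda_1 = 1$ and $\lambda_2 = 2$ with corresponding eigenvectors

\[ \mathbf{v}_1 = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \]

The matrix that diagonalizes $A$ is then

\[ P = \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} \quad\text{with}\quad P^{-1} = \begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix} \]

The uncoupled system is given by

\[
\mathbf{w}' = P^{-1}AP\mathbf{w}
= \begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix}
  \begin{bmatrix} 1 & 3 \\ 0 & 2 \end{bmatrix}
  \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix} \mathbf{w}
= \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \mathbf{w}
\]

with general solution

\[ \mathbf{w}(t) = \begin{bmatrix} e^{t} & 0 \\ 0 & e^{2t} \end{bmatrix} \mathbf{w}(0) \]

Hence, the solution to the original system is given by

\[
\begin{aligned}
\mathbf{y}(t) &= \begin{bmatrix} 1 & 3 \\ 0 & 1 \end{bmatrix}
   \begin{bmatrix} e^{t} & 0 \\ 0 & e^{2t} \end{bmatrix}
   \begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix} \mathbf{y}(0) \\
&= \begin{bmatrix} e^{t} & -3e^{t} + 3e^{2t} \\ 0 & e^{2t} \end{bmatrix} \mathbf{y}(0)
\end{aligned}
\]

The general solution can also be written in the form

\[ y_1(t) = [y_1(0) - 3y_2(0)]e^{t} + 3y_2(0)e^{2t} \quad\text{and}\quad y_2(t) = y_2(0)e^{2t} \]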




The phase portrait is shown in Fig. 3. For this example, since $\lambda_1$ and $\lambda_2$ are both
positive, the flow is oriented outward along the lines spanned by $\mathbf{v}_1$ and $\mathbf{v}_2$.


Figure 3 


The process described for solving a system of two equations can be extended to 
higher dimensions provided that the matrix A of the system is diagonalizable. 


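EXAMPLE 3  Find the general solution to the system of differential equations

\[
\begin{aligned}
y_1' &= -y_1 \\
y_2' &= 2y_1 + y_2 \\
y_3' &= 4y_1 + y_2 + 2y_3
\end{aligned}
\]

Solution  The system of differential equations in matrix form is

\[ \mathbf{y}' = A\mathbf{y} = \begin{bmatrix} -1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 1 & 2 \end{bmatrix} \mathbf{y} \]

Since $A$ is triangular, the eigenvalues of $A$ are the diagonal entries $\lambda_1 = -1$, $\lambda_2 = 1$,
and $\lambda_3 = 2$ with corresponding eigenvectors

\[
\mathbf{v}_1 = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix} \qquad
\mathbf{v}_2 = \begin{bmatrix} 0 \\ 1 \\ -1 \end{bmatrix} \qquad\text{and}\qquad
\mathbf{v}_3 = \begin{bmatrix} 0 \\ 0 \\ 2 \end{bmatrix}
\]

respectively. Since $A$ is a 3 x 3 matrix with three distinct eigenvalues, by Corollary
1 of Sec. 5.2, $A$ is diagonalizable. Now, by Theorem 2 of Sec. 5.2, the diagonalizing
matrix is given by

\[
P = \begin{bmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & -1 & 2 \end{bmatrix}
\quad\text{with}\quad
P^{-1} = \begin{bmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & \frac{1}{2} & \frac{1}{2} \end{bmatrix}
\]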




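The related uncoupled system then becomes

\[
\begin{aligned}
\mathbf{w}' &= \begin{bmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & \frac{1}{2} & \frac{1}{2} \end{bmatrix}
   \begin{bmatrix} -1 & 0 & 0 \\ 2 & 1 & 0 \\ 4 & 1 & 2 \end{bmatrix}
   \begin{bmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & -1 & 2 \end{bmatrix} \mathbf{w} \\
&= \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \mathbf{w}
\end{aligned}
\]

with general solution

\[ \mathbf{w}(t) = \begin{bmatrix} e^{-t} & 0 & 0 \\ 0 & e^{t} & 0 \\ 0 & 0 & e^{2t} \end{bmatrix} \mathbf{w}(0) \]

Hence, the solution to the original system is given by

\[
\begin{aligned}
\mathbf{y}(t) &= \begin{bmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & -1 & 2 \end{bmatrix}
   \begin{bmatrix} e^{-t} & 0 & 0 \\ 0 & e^{t} & 0 \\ 0 & 0 & e^{2t} \end{bmatrix}
   \begin{bmatrix} -1 & 0 & 0 \\ 1 & 1 & 0 \\ 1 & \frac{1}{2} & \frac{1}{2} \end{bmatrix} \mathbf{y}(0) \\
&= \begin{bmatrix} e^{-t} & 0 & 0 \\ -e^{-t} + e^{t} & e^{t} & 0 \\ -e^{-t} - e^{t} + 2e^{2t} & -e^{t} + e^{2t} & e^{2t} \end{bmatrix} \mathbf{y}(0)
\end{aligned}
\]

The general solution can also be written in the form

\[
\begin{aligned}
y_1(t) &= y_1(0)e^{-t} \\
y_2(t) &= -y_1(0)e^{-t} + [y_1(0) + y_2(0)]e^{t} \\
y_3(t) &= -y_1(0)e^{-t} - [y_1(0) + y_2(0)]e^{t} + [2y_1(0) + y_2(0) + y_3(0)]e^{2t}
\end{aligned}
\]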
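
As a check on the three-dimensional example, the product $P e^{Dt} P^{-1}$ can be multiplied out numerically and compared with the component formulas for $y_1(t)$, $y_2(t)$, and $y_3(t)$. This is only a sketch: the helper names (`matmul`, `solution_matrix`, `closed_form`, `apply`) are hypothetical, and the matrices are entered by hand from the example.

```python
import math

# Matrices from the 3x3 example; the columns of P are the eigenvectors.
P = [[-1.0, 0.0, 0.0],
     [1.0, 1.0, 0.0],
     [1.0, -1.0, 2.0]]
P_INV = [[-1.0, 0.0, 0.0],
         [1.0, 1.0, 0.0],
         [1.0, 0.5, 0.5]]
EIGENVALUES = [-1.0, 1.0, 2.0]


def matmul(X, Y):
    """Plain nested-list matrix product."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def apply(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


def solution_matrix(t):
    """Return P exp(Dt) P^{-1} for the example system."""
    exp_d = [[math.exp(lam * t) if i == j else 0.0 for j in range(3)]
             for i, lam in enumerate(EIGENVALUES)]
    return matmul(matmul(P, exp_d), P_INV)


def closed_form(y0, t):
    """Component formulas for y1(t), y2(t), y3(t) stated in the example."""
    a, b, c = y0
    e1, e2, e3 = math.exp(-t), math.exp(t), math.exp(2.0 * t)
    return [a * e1,
            -a * e1 + (a + b) * e2,
            -a * e1 - (a + b) * e2 + (2.0 * a + b + c) * e3]
```

For any initial vector `y0` and time `t`, `apply(solution_matrix(t), y0)` and `closed_form(y0, t)` should agree up to rounding error.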


Example 4 gives an illustration of how a linear system of differential equations 
can be used to model the concentration of salt in two interconnected tanks. 


EXAMPLE 4  Suppose that two brine storage tanks are connected with two pipes used to exchange
solutions between them. The first pipe allows water from tank 1 to enter tank 2 at
a rate of 5 gal/min. The second pipe reverses the process, allowing water to flow
from tank 2 to tank 1, also at a rate of 5 gal/min. Initially, the first tank contains
a well-mixed solution of 8 lb of salt in 50 gal of water, while the second tank
contains 100 gal of pure water.

Figure 4

a. Find the linear system of differential equations to describe the amount of salt
in each tank at time $t$.
b. Solve the system of equations by reducing it to an uncoupled system.
c. Determine the amount of salt in each tank as $t$ increases to infinity and explain
the result.

Solution

a. Let $y_1(t)$ and $y_2(t)$ be the amount of salt (in pounds) in each tank after $t$ min.
Thus, $y_1'(t)$ and $y_2'(t)$ are, respectively, the rates of change for the amount of
salt in tank 1 and tank 2. To develop a system of equations, note that for each
tank

Rate of change of salt = rate in - rate out

Since the volume of brine in each tank remains constant, for tank 1, the rate in
is $\frac{5}{100} y_2(t)$ while the rate out is $\frac{5}{50} y_1(t)$. For tank 2, the rate in is $\frac{5}{50} y_1(t)$ while
the rate out is $\frac{5}{100} y_2(t)$. The system of differential equations is then given by

\[
y_1'(t) = -\frac{5}{50} y_1(t) + \frac{5}{100} y_2(t) \qquad
y_2'(t) = \frac{5}{50} y_1(t) - \frac{5}{100} y_2(t)
\]

that is,

\[
y_1'(t) = -\frac{1}{10} y_1(t) + \frac{1}{20} y_2(t) \qquad
y_2'(t) = \frac{1}{10} y_1(t) - \frac{1}{20} y_2(t)
\]

Since the initial amounts of salt in tank 1 and tank 2 are 8 and 0 lb, respectively,
the initial conditions on the system are $y_1(0) = 8$ and $y_2(0) = 0$.
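
The balance-law model can be explored numerically before it is solved in closed form. The sketch below builds the coefficient matrix from the flow rate and tank volumes and steps the system forward with a classical fourth-order Runge-Kutta integrator. Note that this numerical method is a swapped-in technique for illustration, not the diagonalization method of this section, and the names `tank_matrix`, `deriv`, and `rk4` are hypothetical.

```python
def tank_matrix(rate, vol1, vol2):
    """Coefficient matrix A for y' = Ay from 'rate in - rate out'.

    rate is the flow (gal/min) in each pipe; vol1 and vol2 are the
    constant tank volumes (gal).
    """
    return [[-rate / vol1, rate / vol2],
            [rate / vol1, -rate / vol2]]


def deriv(A, y):
    """Right-hand side A*y of the linear system."""
    return [A[0][0] * y[0] + A[0][1] * y[1],
            A[1][0] * y[0] + A[1][1] * y[1]]


def rk4(A, y0, t_end, steps):
    """Integrate y' = Ay from t = 0 to t_end with fixed-step RK4."""
    h = t_end / steps
    y = list(y0)
    for _ in range(steps):
        k1 = deriv(A, y)
        k2 = deriv(A, [y[i] + 0.5 * h * k1[i] for i in range(2)])
        k3 = deriv(A, [y[i] + 0.5 * h * k2[i] for i in range(2)])
        k4 = deriv(A, [y[i] + h * k3[i] for i in range(2)])
        y = [y[i] + h * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6.0
             for i in range(2)]
    return y
```

With `A = tank_matrix(5.0, 50.0, 100.0)` and `y0 = [8.0, 0.0]`, each column of `A` sums to zero, so the total amount of salt `y[0] + y[1]` stays at 8 lb throughout the integration, as expected for a closed pair of tanks.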


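b. The system of equations in matrix form is given by

\[
\mathbf{y}' = \begin{bmatrix} -\frac{1}{10} & \frac{1}{20} \\ \frac{1}{10} & -\frac{1}{20} \end{bmatrix} \mathbf{y}
\quad\text{with}\quad
\mathbf{y}(0) = \begin{bmatrix} 8 \\ 0 \end{bmatrix}
\]

The eigenvalues of the matrix are $\lambda_1 = -\frac{3}{20}$ and $\lambda_2 = 0$ with corresponding
eigenvectors $\begin{bmatrix} -1 \\ 1 \end{bmatrix}$ and $\begin{bmatrix} \frac{1}{2} \\ 1 \end{bmatrix}$. Thus, the matrix that uncouples the system
is

\[
P = \begin{bmatrix} -1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}
\quad\text{with}\quad
P^{-1} = \begin{bmatrix} -\frac{2}{3} & \frac{1}{3} \\ \frac{2}{3} & \frac{2}{3} \end{bmatrix}
\]

The uncoupled system is then given by

\[
\mathbf{w}' = \begin{bmatrix} -\frac{2}{3} & \frac{1}{3} \\ \frac{2}{3} & \frac{2}{3} \end{bmatrix}
\begin{bmatrix} -\frac{1}{10} & \frac{1}{20} \\ \frac{1}{10} & -\frac{1}{20} \end{bmatrix}
\begin{bmatrix} -1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix} \mathbf{w}
= \begin{bmatrix} -\frac{3}{20} & 0 \\ 0 & 0 \end{bmatrix} \mathbf{w}
\]

The solution to the uncoupled system is

\[ \mathbf{w}(t) = \begin{bmatrix} e^{-\frac{3}{20}t} & 0 \\ 0 & 1 \end{bmatrix} \mathbf{w}(0) \]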


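Hence, the solution to the original system is given by

\[
\begin{aligned}
\mathbf{y}(t) &= \begin{bmatrix} -1 & \frac{1}{2} \\ 1 & 1 \end{bmatrix}
   \begin{bmatrix} e^{-\frac{3}{20}t} & 0 \\ 0 & 1 \end{bmatrix}
   \begin{bmatrix} -\frac{2}{3} & \frac{1}{3} \\ \frac{2}{3} & \frac{2}{3} \end{bmatrix}
   \begin{bmatrix} 8 \\ 0 \end{bmatrix} \\
&= \frac{1}{3} \begin{bmatrix} 2e^{-\frac{3}{20}t} + 1 & -e^{-\frac{3}{20}t} + 1 \\ -2e^{-\frac{3}{20}t} + 2 & e^{-\frac{3}{20}t} + 2 \end{bmatrix}
   \begin{bmatrix} 8 \\ 0 \end{bmatrix} \\
&= \frac{8}{3} \begin{bmatrix} 2e^{-\frac{3}{20}t} + 1 \\ -2e^{-\frac{3}{20}t} + 2 \end{bmatrix}
\end{aligned}
\]

The solution to the system in equation form is given by

\[
y_1(t) = \frac{8}{3}\left(2e^{-\frac{3}{20}t} + 1\right) \quad\text{and}\quad
y_2(t) = \frac{8}{3}\left(-2e^{-\frac{3}{20}t} + 2\right)
\]

c. To find the amount of salt in each tank as $t$ goes to infinity, we compute the
limits

\[ \lim_{t \to \infty} \frac{8}{3}\left(2e^{-\frac{3}{20}t} + 1\right) = \frac{8}{3}(0 + 1) = \frac{8}{3} \]

and

\[ \lim_{t \to \infty} \frac{8}{3}\left(-2e^{-\frac{3}{20}t} + 2\right) = \frac{8}{3}(0 + 2) = \frac{16}{3} \]

These values make sense intuitively as we expect that the 8 lb of salt should
eventually be thoroughly mixed, and divided proportionally between the two
tanks in a ratio of 1:2.


Exercise Set 5.3

In Exercises 1-6, find the general solution to the
system of differential equations.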


1. \[ \begin{aligned} y_1' &= -y_1 + y_2 \\ y_2' &= -2y_2 \end{aligned} \]

2 yi =—y1 +2y2 

yy = Al 

ee = ey 

yy =—3yi+ ye 
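4. \[ \begin{aligned} y_1' &= y_1 - y_2 \\ y_2' &= -y_1 + y_2 \end{aligned} \]

5. \[ \begin{aligned} y_1' &= -4y_1 - 3y_2 - 3y_3 \\ y_2' &= 2y_1 + 3y_2 + 2y_3 \\ y_3' &= 4y_1 + 2y_2 + 3y_3 \end{aligned} \]

6. \[ \begin{aligned} y_1' &= -3y_1 - 4y_2 - 4y_3 \\ y_2' &= 7y_1 + 11y_2 + 13y_3 \\ y_3' &= -5y_1 - 8y_2 - 10y_3 \end{aligned} \]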

In Exercises 7 and 8, solve the initial-value problem. 


7 yo =y yi(0) = 1 y2(0) = -1 
yy = 2y +y 


8. \[ \begin{aligned} y_1' &= 5y_1 - 12y_2 + 20y_3 \\ y_2' &= 4y_1 - 9y_2 + 16y_3 \\ y_3' &= 2y_1 - 4y_2 + 7y_3 \end{aligned} \]
\[ y_1(0) = 2 \qquad y_2(0) = -1 \qquad y_3(0) = 0 \]

9. Suppose that two brine storage tanks are
connected with two pipes used to exchange
solutions between them. The first pipe allows
water from tank 1 to enter tank 2 at a rate of
1 gal/min. The second pipe reverses the process,
allowing water to flow from tank 2 to tank 1, also
at a rate of 1 gal/min. Initially, the first tank
contains a well-mixed solution of 12 lb of salt in
60 gal of water, while the second tank contains
120 gal of pure water.

a. Find the linear system of differential equations
to describe the amount of salt in each tank at
time $t$.

b. Solve the system of equations by reducing it to
an uncoupled system.

c. Determine the amount of salt in each tank as $t$
increases to infinity and explain the result.

10. On a cold winter night when the outside
temperature is 0 degrees Fahrenheit (0°F) at
9:00 p.m. the furnace in a two-story home fails.
Suppose the rates of heat flow between the
upstairs, downstairs, and outside are as shown in
the figure. Further suppose the temperature of the
first floor is 70°F and that of the second floor is
60°F when the furnace fails.

[Figure: heat-flow diagram for floors 1 and 2 and the outside, with rate labels 0.2, 0.1, and 0.5]

a. Use the balance law

Net rate of change = rate in - rate out

to set up an initial-value problem to model the
heat flow.

b. Solve the initial-value problem found in
part (a).

c. Compute how long it takes for each floor to
reach 32°F.


5.4 Application: Markov Chains


In probability theory a Markov process refers to a type of mathematical model used 
to analyze a sequence of random events. A critical factor when computing the proba- 
bilities of a succession of events is whether the events are dependent on one another. 
For example, each toss of a fair coin is an independent event as the coin has no mem- 
ory of a previous toss. A Markov process is useful in describing the tendencies of 
conditionally dependent random events, where the likelihood of each event depends 
on what happened previously. 

As an illustration, we consider a simple weather model based on the two
observations that

1. If today is sunny, then there is a 70 percent chance that tomorrow will be sunny.

2. If today is cloudy, then there is a 50 percent chance that tomorrow will be cloudy.

The conditional probabilities for the weather tomorrow, given the weather for
today, are given in Table 1.

Table 1

             Sunny    Cloudy
   Sunny      0.7      0.5
   Cloudy     0.3      0.5

The column headings in Table 1 describe today’s weather, and the row headings 
the weather for tomorrow. For example, the probability that a sunny day today is 




followed by another sunny day tomorrow is 0.7, and the probability that a sunny day 
today is followed by a cloudy day tomorrow is 0.3. Notice that the column sums are 
both 1, since, for example, it is certain that a sunny day today is followed by either 
a sunny day or a cloudy day tomorrow. 

In a Markov process, these observations are applied iteratively, giving us the 
ability to entertain questions such as, If today is sunny, what is the probability that it 
will be sunny one week from today? 


State Vectors and Transition Matrices 


To develop the Markov process required to make predictions about the weather using
the observations above, we start with a vector $\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}$ whose components are the
probabilities for the current weather conditions. In particular, let $v_1$ be the probability
that today is sunny and $v_2$ the probability that today is cloudy. Each day the
components of $\mathbf{v}$ change in accordance with the probabilities listed in Table 1, giving us
the current state of the weather. In a Markov process, the vector $\mathbf{v}$ is called a state
vector, and a sequence of state vectors a Markov chain. Using Table 1, the state
vector $\mathbf{v}' = \begin{bmatrix} v_1' \\ v_2' \end{bmatrix}$ for the weather tomorrow has components

\[ v_1' = 0.7v_1 + 0.5v_2 \quad\text{and}\quad v_2' = 0.3v_1 + 0.5v_2 \]

That is, the probability $v_1'$ of a sunny day tomorrow is 0.7 times the probability of a
sunny day today plus 0.5 times the probability of a cloudy day today. Likewise, the
probability $v_2'$ of a cloudy day tomorrow is 0.3 times the probability of a sunny day
today plus 0.5 times the probability of a cloudy day today. For example, if today is
sunny, then $v_1 = 1$ and $v_2 = 0$ so that

\[ v_1' = 0.7(1) + 0.5(0) = 0.7 \quad\text{and}\quad v_2' = 0.3(1) + 0.5(0) = 0.3 \]

which is in agreement with the observations above. Observe that if we let $T$ be the
matrix

\[ T = \begin{bmatrix} 0.7 & 0.5 \\ 0.3 & 0.5 \end{bmatrix} \]

then the relationship above between $\mathbf{v}$ and $\mathbf{v}'$ can be written using matrix
multiplication as

\[ \begin{bmatrix} v_1' \\ v_2' \end{bmatrix} = \begin{bmatrix} 0.7 & 0.5 \\ 0.3 & 0.5 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \]


In a Markov chain, the matrix used to move from one state to the next is called
the transition matrix. If $n$ is the number of possible states, then the transition matrix
$T$ is an $n \times n$ matrix where the $ij$ entry is the probability of moving from state $j$
to state $i$. In the above example $t_{12} = 0.5$ gives the probability that a cloudy day
is followed by one that is sunny. A vector with nonnegative entries whose sum is 1 is
called a probability vector. A matrix whose column vectors are probability vectors
is called a stochastic matrix. The transition matrix $T$ given above is an example of
a stochastic matrix.




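Returning to the weather example, to predict the weather 2 days forward, we
apply the transition matrix $T$ to the vector $\mathbf{v}'$ so that

\[
\begin{aligned}
\begin{bmatrix} v_1'' \\ v_2'' \end{bmatrix}
&= \begin{bmatrix} 0.7 & 0.5 \\ 0.3 & 0.5 \end{bmatrix} \begin{bmatrix} v_1' \\ v_2' \end{bmatrix} \\
&= \begin{bmatrix} 0.7 & 0.5 \\ 0.3 & 0.5 \end{bmatrix}^2 \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
 = \begin{bmatrix} 0.64 & 0.60 \\ 0.36 & 0.40 \end{bmatrix} \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
\end{aligned}
\]

Thus, for example, if today is sunny, the state vector for the weather 2 days from now
is given by

\[
\begin{bmatrix} v_1'' \\ v_2'' \end{bmatrix}
= \begin{bmatrix} 0.64 & 0.60 \\ 0.36 & 0.40 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0.64 \\ 0.36 \end{bmatrix}
\]

In general, after $n$ days the state vector for the weather is given by

\[ T^n\mathbf{v} = \begin{bmatrix} 0.7 & 0.5 \\ 0.3 & 0.5 \end{bmatrix}^n \begin{bmatrix} v_1 \\ v_2 \end{bmatrix} \]

To answer the question posed earlier about the weather one week after a sunny day,
we compute

\[
\begin{bmatrix} 0.7 & 0.5 \\ 0.3 & 0.5 \end{bmatrix}^7 \begin{bmatrix} 1 \\ 0 \end{bmatrix}
\approx \begin{bmatrix} 0.625 & 0.625 \\ 0.375 & 0.375 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix}
= \begin{bmatrix} 0.625 \\ 0.375 \end{bmatrix}
\]

That is, if today is sunny, then the probability that it will be sunny one week after
today is 0.625, and the probability it will be cloudy is 0.375.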
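
The repeated application of $T$ described above can be sketched in a few lines. This is an illustrative sketch only; the helper names `mat_mul`, `mat_pow`, and `mat_vec` are hypothetical, and the matrix power is computed by plain repeated multiplication.

```python
def mat_mul(X, Y):
    """Product of two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def mat_pow(T, n):
    """T raised to the n-th power (n >= 0) by repeated multiplication."""
    size = len(T)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, T)
    return result


def mat_vec(M, v):
    """Matrix-vector product."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]


# Weather model: columns are today's weather, rows are tomorrow's.
T = [[0.7, 0.5],
     [0.3, 0.5]]

two_days = mat_pow(T, 2)                     # entries close to 0.64, 0.60, 0.36, 0.40
one_week = mat_vec(mat_pow(T, 7), [1.0, 0.0])  # state vector one week after a sunny day
```

Here `one_week` agrees with the state vector (0.625, 0.375) to about five decimal places; the small discrepancy reflects that $T^7$ is only approximately equal to its limiting matrix.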


Diagonalizing the Transition Matrix 


As we have just seen, determining future states in a Markov process involves comput- 
ing powers of the transition matrix. To facilitate the computations, we use the methods 
of Sec. 5.2 to diagonalize the transition matrix. To illustrate, we again consider the 
transition matrix 

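\[ T = \begin{bmatrix} 0.7 & 0.5 \\ 0.3 & 0.5 \end{bmatrix} \]

of the weather example above. Observe that $T$ has distinct eigenvalues given by

\[ \lambda_1 = 1 \quad\text{and}\quad \lambda_2 = \frac{1}{5} \]

with corresponding eigenvectors

\[ \mathbf{v}_1 = \begin{bmatrix} \frac{5}{3} \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} -1 \\ 1 \end{bmatrix} \]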




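For reasons that will soon be clear, we scale $\mathbf{v}_1$ (by the reciprocal of the sum of its
components) so that it becomes a probability vector. Observe that this new vector

\[ \hat{\mathbf{v}}_1 = \begin{bmatrix} \frac{5}{8} \\ \frac{3}{8} \end{bmatrix} \]

is also an eigenvector since it is in the eigenspace $V_{\lambda_1}$. Since the 2 x 2 transition
matrix has two distinct eigenvalues, by Corollary 1 of Sec. 5.2, $T$ is diagonalizable
and, by Theorem 2 of Sec. 5.2, can be written as

\[
T = PDP^{-1}
= \begin{bmatrix} \frac{5}{8} & -1 \\ \frac{3}{8} & 1 \end{bmatrix}
  \begin{bmatrix} 1 & 0 \\ 0 & \frac{1}{5} \end{bmatrix}
  \begin{bmatrix} 1 & 1 \\ -\frac{3}{8} & \frac{5}{8} \end{bmatrix}
\]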


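By Exercise 27 of Sec. 5.2, the powers of $T$ are given by

\[
T^n = PD^nP^{-1}
= \begin{bmatrix} \frac{5}{8} & -1 \\ \frac{3}{8} & 1 \end{bmatrix}
  \begin{bmatrix} 1 & 0 \\ 0 & \left(\frac{1}{5}\right)^n \end{bmatrix}
  \begin{bmatrix} 1 & 1 \\ -\frac{3}{8} & \frac{5}{8} \end{bmatrix}
\]

As mentioned above, this gives us an easier way to compute the state vector for
large values of $n$. Another benefit from this representation is that the matrix $D^n$
approaches

\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \]

as $n$ gets large. This suggests that the eigenvector corresponding to $\lambda = 1$ is useful in
determining the limiting proportion of sunny days to cloudy days far into the future.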
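
The identity $T^n = PD^nP^{-1}$ for the weather model can be checked with exact rational arithmetic. The sketch below assumes $P$ is built from the probability eigenvector with components 5/8 and 3/8 for $\lambda_1 = 1$ and the eigenvector with components -1 and 1 for $\lambda_2 = 1/5$; the function names are hypothetical.

```python
from fractions import Fraction as F

T = [[F(7, 10), F(5, 10)],
     [F(3, 10), F(5, 10)]]
P = [[F(5, 8), F(-1)],
     [F(3, 8), F(1)]]
P_INV = [[F(1), F(1)],
         [F(-3, 8), F(5, 8)]]


def matmul(X, Y):
    """Product of two 2x2 (or larger) matrices of Fractions."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def power_by_diagonalization(n):
    """T^n computed as P D^n P^{-1}; only the diagonal entries are powered."""
    d_n = [[F(1), F(0)],
           [F(0), F(1, 5) ** n]]
    return matmul(matmul(P, d_n), P_INV)


def power_by_multiplication(n):
    """T^n computed by multiplying T by itself n times, for comparison."""
    result = [[F(1), F(0)], [F(0), F(1)]]
    for _ in range(n):
        result = matmul(result, T)
    return result
```

Because `Fraction` arithmetic is exact, the two functions return identical matrices for every `n`, and the entries of `power_by_diagonalization(n)` differ from 5/8 and 3/8 only by multiples of $(1/5)^n$.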


Steady-State Vector 


Given an initial state vector $\mathbf{v}$, of interest is the long-run behavior of this vector in a
Markov chain, that is, the tendency of the vector $T^n\mathbf{v}$ for large $n$. If for any initial
state vector $\mathbf{v}$ there is some vector $\mathbf{s}$ such that $T^n\mathbf{v}$ approaches $\mathbf{s}$, then $\mathbf{s}$ is called a
steady-state vector for the Markov process.

In our weather model we saw that the transition matrix $T$ has an eigenvalue $\lambda = 1$
and a corresponding probability eigenvector given by

\[ \hat{\mathbf{v}}_1 = \begin{bmatrix} \frac{5}{8} \\ \frac{3}{8} \end{bmatrix} \]




We claim that this vector is a steady-state vector for the weather model. As verification, 


0.6 
RE | 0.6250000002 


let u be an initial probability vector, say, u = . We then compute 


and 


~ | 0.3750000046 


704 — | 9:6249999954 
= | 0.3750000002 


which suggests that $T^n\mathbf{u}$ approaches $\hat{\mathbf{v}}_1$. That this is in fact the case is stated in
Theorem 6. Before doing so, we note that a regular transition matrix $T$ is a transition
matrix such that for some $n$, all the entries of $T^n$ are positive.


THEOREM 6 If a Markov chain has a regular stochastic transition matrix T, then there is a 
unique probability vector s with Ts = s. Moreover, s is the steady-state vector for 
any initial probability vector. 


EXAMPLE 1  A group insurance plan allows three different options for participants, plan A, B,
or C. Suppose that the percentages of the total number of participants enrolled in
each plan are 25 percent, 30 percent, and 45 percent, respectively. Also, from past
experience assume that participants change plans as shown in the table.


a. Find the percent of participants enrolled in each plan after 5 years. 
b. Find the steady-state vector for the system. 


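Solution  Let $T$ be the matrix given by

\[ T = \begin{bmatrix} 0.75 & 0.25 & 0.2 \\ 0.15 & 0.45 & 0.4 \\ 0.1 & 0.3 & 0.4 \end{bmatrix} \]

a. The fraction of participants enrolled in each plan after 5 years is approximated
by the vector

\[
T^5\mathbf{v} = \begin{bmatrix} 0.49776 & 0.46048 & 0.45608 \\ 0.28464 & 0.30432 & 0.30664 \\ 0.21760 & 0.23520 & 0.23728 \end{bmatrix}
\begin{bmatrix} 0.25 \\ 0.30 \\ 0.45 \end{bmatrix}
\approx \begin{bmatrix} 0.47 \\ 0.30 \\ 0.23 \end{bmatrix}
\]

so approximately 47 percent will be enrolled in plan A, 30 percent in plan B,
and 23 percent in plan C.

b. The steady-state vector for the system is the probability eigenvector
corresponding to the eigenvalue $\lambda = 1$, that is,

\[
\mathbf{s} = \frac{1}{44} \begin{bmatrix} 21 \\ 13 \\ 10 \end{bmatrix}
\approx \begin{bmatrix} 0.48 \\ 0.30 \\ 0.23 \end{bmatrix}
\]

where the decimal entries are rounded to two places.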
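
Theorem 6 suggests a simple way to approximate a steady-state vector: apply a regular transition matrix repeatedly to any initial probability vector until the result stops changing. The following is a minimal sketch of that iteration (often called power iteration), not the eigenvector computation used in the solution above; the function name `steady_state` and its tolerance parameters are hypothetical.

```python
def steady_state(T, v, tol=1e-12, max_steps=10_000):
    """Iterate v <- T v until successive vectors differ by less than tol.

    Assumes T is a regular stochastic matrix (columns sum to 1) and v is
    a probability vector, so that the iteration converges.
    """
    for _ in range(max_steps):
        nxt = [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]
        if max(abs(a - b) for a, b in zip(nxt, v)) < tol:
            return nxt
        v = nxt
    raise RuntimeError("iteration did not converge")


# Transition matrix and initial enrollment fractions from the insurance example.
T = [[0.75, 0.25, 0.2],
     [0.15, 0.45, 0.4],
     [0.10, 0.30, 0.4]]
s = steady_state(T, [0.25, 0.30, 0.45])
```

The result `s` can be checked against the defining property $T\mathbf{s} = \mathbf{s}$, and starting from a different probability vector should give the same limit.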


Exercise Set 5.4 


1. Each year it is estimated that 15 percent of the
population in a city moves to the surrounding
suburbs and 8 percent of people living in the
suburbs move to the city. Currently, the total
population of the city and surrounding suburbs is
2 million people with 1.4 million living in the
city.

a. Write the transition matrix for the Markov
chain describing the migration pattern.

b. Compute the expected population after 10
years.

c. Find the steady-state probability vector.


2. After opening a new mass transit system, the
transit authority studied the user patterns to try to 
determine the number of people who switched 
from using an automobile to the system. They 
estimated that each year 30 percent of those who 
tried the mass transit system decided to go back 
to driving and 20 percent switched from driving 
to using mass transit. Suppose that the population 
remains constant and that initially 35 percent of 
the commuters use mass transit. 
a. Write the transition matrix for the Markov 
chain describing the system. 
b. Compute the expected number of commuters 
who will be using the mass transit system in 2 
years. In 5 years. 


c. Find the steady-state probability vector. 


3. A plant blooms with red, pink, or white flowers.
When a variety with red flowers is cross-bred 
with another variety, the probabilities of the new 
plant having red, pink, or white flowers are given 
in the table. 


R IP W 
ros fofo 
P foafoa foz 


pr [or] oafor 


Suppose initially there are only plants with pink
flowers which are bred with other varieties with
the same likelihood. Find the probabilities of each
variety occurring after three generations. After 10
generations.


4. A fleet of taxis picks up and delivers commuters
between two nearby cities A and B and the
surrounding suburbs S. The probability of a driver
picking up a passenger in location X and
delivering the passenger to location Y is given in
the table. The taxi company is interested in
knowing on average where the taxis are.


Ep 
ra [os] 03 fos 


ra [or] osfoa 
slef 


a. If a taxi is in city A, what is the probability it 
will be in location S after three fares? 


b. Suppose 30 percent of the taxis are in city A, 


35 percent are in city B, and 35 percent are in 
the suburbs. Calculate the probability of a taxi 
being in location A, B, or S after five fares. 


c. Find the steady-state probability vector. 


5. An endemic disease that has reached epidemic
proportions takes the lives of one-quarter of those
who are ill each month while one-half of those
who are healthy become ill. Determine whether
the epidemic will be eradicated. If so, estimate
how long it will take.

6. A regional study of smokers revealed that from
one year to the next 55 percent of smokers quit
while 20 percent of nonsmokers either became
new smokers or started smoking again. If 70
percent of the population are smokers, what
fraction will be smoking in 5 years? In 10 years?
In the long run?

7. A frog is confined to sitting on one of four lily
pads. The pads are arranged in a square. Label the
corners of the square A, B, C, and D clockwise.
Each time the frog jumps, the probability of
jumping to an adjacent pad is 1/4, the probability
of jumping to the diagonal pad is 1/6, and the
probability of landing on the same pad is 1/3.

a. Write the transition matrix for the Markov
process.

b. Find the probability state vector after the frog
has made $n$ jumps starting at pad A.

c. Find the steady-state vector.


8. Let the transition matrix for a Markov process be

\[ T = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

a. Find the eigenvalues of $T$.

b. Find $T^n$ for $n \geq 1$. Use $T^n$ to explain why the
Markov process does not have a steady-state
vector.

c. Suppose $T$ is the transition matrix describing
the population distribution at any time for a
constant population where residents can move
between two locations. Describe the interaction
in the population.

9. Show that for all $p$ and $q$ such that $0 < p < 1$
and $0 < q < 1$, the transition matrix

\[ T = \begin{bmatrix} 1 - p & q \\ p & 1 - q \end{bmatrix} \]

has steady-state probability vector

\[ \begin{bmatrix} \dfrac{q}{p + q} \\[2mm] \dfrac{p}{p + q} \end{bmatrix} \]


10. Suppose the transition matrix T for a Markov 


process is a 2 x 2 stochastic matrix that is also 
symmetric. 


a. Find the eigenvalues for the matrix T. 


b. Find the steady-state probability vector for the 
Markov process. 


Review Exercises for Chapter 5 


1. Let

\[ A = \begin{bmatrix} a & b \\ b & a \end{bmatrix} \]

for some real numbers $a$ and $b$.

a. Show that $\begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is an eigenvector of $A$.

b. Find the eigenvalues of $A$.

c. Find the eigenvectors corresponding to each
eigenvalue found in part (b).

d. Diagonalize the matrix $A$, using the
eigenvectors found in part (c). That is, find the
matrix $P$ such that $P^{-1}AP$ is a diagonal
matrix. Specify the diagonal matrix.

2. Let

\[ A = \begin{bmatrix} 0 & 0 & 2 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{bmatrix} \]

a. Find the eigenvalues of A. 

b. From your result in part (a) can you conclude 
whether A is diagonalizable? Explain. 


c. Find the eigenvectors corresponding to each 
eigenvalue. 


d. Are the eigenvectors found in part (c) linearly 
independent? Explain. 

e. From your result in part (c) can you conclude 
whether A is diagonalizable? Explain. 

f. If your answer to part (e) is yes, find a matrix 
P that diagonalizes A. Specify the diagonal 
matrix D such that D = P~!AP. 


3. Repeat Exercise 2 with 


1 0 0 
1 0 
a 0 0 


RK Orr 


1 
0 
1 0 0 


4. Let $T$ be a linear operator on a finite dimensional
vector space with a matrix representation

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 6 & 3 & 2 \\ -3 & -1 & 0 \end{bmatrix} \]

a. Find the characteristic polynomial for $A$.

b. Find the eigenvalues of $A$.

c. Find the dimension of each eigenspace of $A$.

d. Using part (c), explain why the operator $T$ is
diagonalizable.

e. Find a matrix $P$ and diagonal matrix $D$ such
that $D = P^{-1}AP$.

f. Find two other matrices $P_1$ and $P_2$ and
corresponding diagonal matrices $D_1$ and $D_2$
such that $D_1 = P_1^{-1}AP_1$ and $D_2 = P_2^{-1}AP_2$.


5. Let

\[ A = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ -k & 3 & 0 \end{bmatrix} \]

a. Show the characteristic equation of $A$ is
$\lambda^3 - 3\lambda + k = 0$.

b. Sketch the graph of $y(\lambda) = \lambda^3 - 3\lambda + k$ for
$k < -2$, $k = 0$, and $k > 2$.

c. Determine the values of $k$ for which the matrix
$A$ has three distinct real eigenvalues.

6. Suppose that $B = P^{-1}AP$ and $\mathbf{v}$ is an eigenvector
of $B$ corresponding to the eigenvalue $\lambda$. Show
that $P\mathbf{v}$ is an eigenvector of $A$ corresponding to
the eigenvalue $\lambda$.

7. Suppose that $A$ is an $n \times n$ matrix such that every
row of $A$ has the same sum $\lambda$.

a. Show that $\lambda$ is an eigenvalue of $A$.

b. Does the same result hold if the sum of every
column of $A$ is equal to $\lambda$?


8. Let $V$ be a vector space and $T: V \to V$ a linear
operator. A subspace $W$ of $V$ is invariant under
$T$ if for each vector $\mathbf{w}$ in $W$, the vector $T(\mathbf{w})$ is
in $W$.

a. Explain why $V$ and $\{\mathbf{0}\}$ are invariant subspaces
of every linear operator on the vector space.

b. Show that if there is a one-dimensional
subspace of $V$ that is invariant under $T$, then $T$
has a nonzero eigenvector.

c. Let $T$ be a linear operator on $\mathbb{R}^2$ with matrix
representation relative to the standard basis
given by

\[ A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \]

Show that the only invariant subspaces of $T$
are $\mathbb{R}^2$ and $\{\mathbf{0}\}$.

9. a. Two linear operators $S$ and $T$ on a vector
space $V$ are said to commute if $S(T(\mathbf{v})) =
T(S(\mathbf{v}))$ for every vector $\mathbf{v}$ in $V$. If $S$ and $T$ are
commuting linear operators on $V$ and $\lambda_0$ is an
eigenvalue of $T$, show that $V_{\lambda_0}$ is invariant
under $S$, that is, $S(V_{\lambda_0}) \subseteq V_{\lambda_0}$.

b. Let $S$ and $T$ be commuting linear operators on
an $n$-dimensional vector space $V$. Suppose that
$T$ has $n$ distinct eigenvalues. Show that $S$ and
$T$ have a common eigenvector.

c. A pair of linear operators $T$ and $S$ on a vector
space $V$ is called simultaneously diagonalizable
if there is an ordered basis $B$ for $V$ such that
$[T]_B$ and $[S]_B$ are both diagonal. Show that if
$S$ and $T$ are simultaneously diagonalizable
linear operators on an $n$-dimensional vector
space $V$, then $S$ and $T$ commute.

d. Show directly that the matrices

\[ A = \begin{bmatrix} 3 & 0 & 1 \\ 0 & 2 & 0 \\ 1 & 0 & 3 \end{bmatrix} \]

and

\[ B = \begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 0 \\ -2 & 0 & 1 \end{bmatrix} \]

are simultaneously diagonalizable.

10. The Taylor series expansion (about $x = 0$) for the
natural exponential function is

\[ e^x = 1 + x + \frac{1}{2!}x^2 + \frac{1}{3!}x^3 + \cdots = \sum_{k=0}^{\infty} \frac{1}{k!}x^k \]

If $A$ is an $n \times n$ matrix, we can define the matrix
exponential as

\[
\begin{aligned}
e^A &= I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \cdots \\
&= \lim_{m \to \infty} \left( I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \cdots + \frac{1}{m!}A^m \right)
\end{aligned}
\]




a. Let D be the diagonal matrix b. Suppose A is diagonalizable and D = P~'AP. 
m1 0 0 ... O Show that e^ = Pe? P7!. 
0 OO ... 0 c. Use parts (a) and (b) to compute e^ for the 
P| tes ee as matrix 
eke Ace a ONG) — 
i [$4] 
and find e?. 
Chapter 5: Chapter Test 
In Exercises 1—40, determine whether the statement is 6. The eigenvectors of 
true or false. a-f- 4 0 
1. The matrix B 3 —5 


1 1 
P= 
Eos | we | |] ana | 3 |. 
diagonalizes the matrix 
7. The matrix 

pape k . a ey: 

a O 2 a cae 
f has an eigenvalue 4; = 1 and V}; has 
2. The matrix 


dimension 1. 
A= —1 1 
E 0 -2 8. If 
is similar to the matrix A= 3 $ 
-1 0 O3 ı 
D= 0 -1 2 2 
then AA! = I. 


3. The matrix 9. If A is a 2 x 2 matrix with det(A) < 0, then A 
has two real eigenvalues. 


—1 0 0 
A= 0 1 0 10. If A is a 2 x 2 matrix that has two distinct 
-1 -1 1 eigenvalues i; and ^2, then tr(A) = 1 + M2. 
is diagonalizable. 11. If A= | ; 4 | then the eigenvalues of A are 
4. The eigenvalues of 
1 0 hy =at+b and ^ =bd—a. 
A= i i l k 
=4 =3 12. For all integers k the matrix A = 11 has 
are 4; = —3 and ^2 = —1. only one eigenvalue. 
5. The characteristic polynomial of 13. If A is a 2 x 2 invertible matrix, then A and A~! 
= eee ai have the same eigenvalues. 
A=| 0 0 -I 14. If A is similar to B, then tr(A) = tr(B). 
2 —2 -1 


; T Liseyi ; 
is B+ 2242-4. 15. The matrix A = | 01 | is diagonalizable. 


at+c=b+d=hk 


then is an eigenvalue of A. 
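In Exercises 17-19, let

\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{bmatrix} \]

and

\[ B = \begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 2 \end{bmatrix} \]

17. The matrices $A$ and $B$ have the same eigenvalues.

18. The matrices $A$ and $B$ are similar.

19. If

\[ P = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \]

then $B = P^{-1}AP$.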


20. If a 2 x 2 matrix has eigenvectors | = | and 


| = |. then it has the form 


2a-6B a-ß 
B—2a 2B-—a 
21. The only matrix similar to the identity matrix is
the identity matrix.

22. If $\lambda = 0$ is an eigenvalue of $A$, then the matrix $A$
is not invertible.

23. If $A$ is diagonalizable, then $A$ is similar to a
unique diagonal matrix.

24. If an $n \times n$ matrix $A$ has only $m$ distinct
eigenvalues with $m < n$, then $A$ is not
diagonalizable.

25. If an $n \times n$ matrix $A$ has $n$ distinct eigenvalues,
then $A$ is diagonalizable.

26. If an $n \times n$ matrix $A$ has a set of eigenvectors that
is a basis for $\mathbb{R}^n$, then $A$ is diagonalizable.


27. If an $n \times n$ matrix $A$ is diagonalizable, then $A$ has
$n$ linearly independent eigenvectors.

28. If $A$ and $B$ are $n \times n$ matrices, then $AB$ and $BA$
have the same eigenvalues.

29. If $D$ is a diagonal matrix and $A = PDP^{-1}$, then
$A$ is diagonalizable.

30. If $A$ is invertible, then $A$ is diagonalizable.

31. If $A$ and $B$ are $n \times n$ invertible matrices, then
$AB^{-1}$ and $B^{-1}A$ have the same eigenvalues.

32. A 3 x 3 matrix of the form

\[ \begin{bmatrix} a & 1 & 0 \\ 0 & a & 1 \\ 0 & 0 & b \end{bmatrix} \]

always has fewer than three distinct eigenvalues.

33. If $A$ and $B$ are $n \times n$ diagonalizable matrices with
the same diagonalizing matrix, then $AB = BA$.

34. If $\lambda$ is an eigenvalue of the $n \times n$ matrix $A$, then
the set of all eigenvectors corresponding to $\lambda$ is a
subspace of $\mathbb{R}^n$.

35. If each column sum of an $n \times n$ matrix $A$ is a
constant $c$, then $c$ is an eigenvalue of $A$.

36. If $A$ and $B$ are similar, then they have the same
characteristic equation.

37. If $\lambda$ is an eigenvalue of $A$, then $\lambda^2$ is an
eigenvalue of $A^2$.

38. If $A$ is a 2 x 2 matrix with characteristic
polynomial $\lambda^2 + \lambda - 6$, then the eigenvalues of
$A^2$ are $\lambda_1 = 4$ and $\lambda_2 = 9$.

39. Define a linear operator $T: P_1 \to P_1$ by
$T(a + bx) = a + (a + b)x$. Then the matrix
representation for $A$ relative to the standard
basis is

\[ A = \begin{bmatrix} 1 & 0 \\ 1 & 1 \end{bmatrix} \]

and so $T$ is not diagonalizable.

40. If $V = \operatorname{span}\{e^x, e^{-x}\}$ and $T: V \to V$ is defined
by $T(f(x)) = f'(x)$, then $T$ is diagonalizable.


CHAPTER 


Inner Product Spaces 


6.1 The Dot Product on $\mathbb{R}^n$ 323
6.2 Inner Product Spaces 333 

6.3 Orthonormal Bases 342 

6.4 Orthogonal Complements 355 


6.5 Application: Least Squares Approximation 366 
6.6 Diagonalization of Symmetric Matrices 377 

6.7 Application: Quadratic Forms 385 

6.8 Application: Singular Value Decomposition 392 


According to a growing number of scientists,
a contributing factor in the rise in global 
temperatures is the emission of greenhouse gases 
such as carbon dioxide. The primary source of 
carbon dioxide in the atmosphere is from the 
burning of fossil fuels. Table 1* gives the global 
carbon emissions, in billions of tons, from burn- 
ing fossil fuels during the period from 1950 
through 2000. A scatterplot of the data, shown 
in Fig. 1, exhibits an increasing trend which can 
be approximated with a straight line, also shown 
in Fig. 1, which best fits the data even though 


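Table 1
Global Carbon Emissions 1950-2000

Year    Billion Tons
1950    1.63
1955    2.04
1960    2.58
1965    3.14
1970    4.08
1975    4.62
1980    5.32
1985    5.43
1990    6.14
1995    6.40
2000    6.64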


*Worldwatch Institute, Vital Signs 2006—2007. The trends that are shaping our future, W. W. Norton and 
Company, New York London, 2006. 




Chapter 6 Inner Product Spaces 


there is no one line that passes through all the points. To find this line, let $(x_i, y_i)$, for
$i = 1, 2, \ldots, 11$, denote the data points where $x_i$ is the year, starting with $x_1 = 1950$,
and $y_i$ is the amount of greenhouse gas being released into the atmosphere. The linear
equation $y = mx + b$ will best fit these data if we can find values for m and b such
that the sum of the square errors

\[
\sum_{i=1}^{11} \bigl[y_i - (mx_i + b)\bigr]^2 = \bigl[1.63 - (1950m + b)\bigr]^2 + \cdots + \bigl[6.64 - (2000m + b)\bigr]^2
\]


is minimized. One method for finding the numbers m and b uses results from multi- 
variable calculus. An alternative approach, using linear algebra, is derived from the 
ideas developed in this chapter. To use this approach, we attempt to look for numbers 
m and b such that the linear system 

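\[
\begin{aligned}
m(1950) + b &= 1.63 \\
m(1955) + b &= 2.04 \\
&\;\;\vdots \\
m(2000) + b &= 6.64
\end{aligned}
\]
is satisfied. In matrix form, this system is given by $A\mathbf{x} = \mathbf{b}$, where

\[
A = \begin{bmatrix}
1950 & 1 \\ 1955 & 1 \\ 1960 & 1 \\ 1965 & 1 \\ 1970 & 1 \\ 1975 & 1 \\
1980 & 1 \\ 1985 & 1 \\ 1990 & 1 \\ 1995 & 1 \\ 2000 & 1
\end{bmatrix}
\quad\text{and}\quad
\mathbf{b} = \begin{bmatrix}
1.63 \\ 2.04 \\ 2.58 \\ 3.14 \\ 4.08 \\ 4.62 \\ 5.32 \\ 5.43 \\ 6.14 \\ 6.40 \\ 6.64
\end{bmatrix}
\]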


Now, since there is no one line going through each of the data points, an exact solution 
to the previous linear system does not exist! However, as we will see, the best-fit line 
comes from finding a vector x so that Ax is as close as possible to b. In this case, 
the equation of the best-fit line, shown in Fig. 1, is given by 


y = 0.107x — 207.462 
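The quoted best-fit line can be checked numerically. The following is a minimal sketch, not the method developed later in the chapter: it solves the two normal equations for a straight-line fit directly, using the eleven data values that appear in the vector b above. The function name `fit_line` is an illustrative choice.

```python
# Least-squares line through the emissions data listed in the vector b.
years = [1950 + 5 * i for i in range(11)]
emissions = [1.63, 2.04, 2.58, 3.14, 4.08, 4.62, 5.32, 5.43, 6.14, 6.40, 6.64]

def fit_line(xs, ys):
    """Return (m, b) minimizing sum((y - (m*x + b))**2) via the normal equations."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    b = (sy - m * sx) / n
    return m, b

m, b = fit_line(years, emissions)
print(round(m, 3), round(b, 3))
```

Rounded to three decimal places, the slope and intercept agree with the line y = 0.107x - 207.462 given in the text.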


In the last several chapters we have focused our attention on algebraic properties
of abstract vector spaces derived from our knowledge of Euclidean space. For example,
the observations made in Sec. 2.1 regarding the behavior of vectors in $\mathbb{R}^n$ provided
us with a model for the axiomatic development of general vector spaces given in
Sec. 3.1. In this chapter we follow a similar approach as we describe the additional
structures required to generalize the geometric notions of length, distance, and angle
from $\mathbb{R}^2$ and $\mathbb{R}^3$ to abstract vector spaces. These geometric ideas are developed from
a generalization of the dot product of two vectors in $\mathbb{R}^n$, called the inner product,
which we define in Sec. 6.2. We begin with a description of the properties of the dot
product on $\mathbb{R}^n$ and its relation to the geometry in Euclidean space.


6.1 The Dot Product on $\mathbb{R}^n$


In Definition 2 of Sec. 1.3, we defined the dot product of two vectors

\[
\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
\quad\text{and}\quad
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
\]

in $\mathbb{R}^n$ as

\[
\mathbf{u}\cdot\mathbf{v} = u_1v_1 + u_2v_2 + \cdots + u_nv_n
\]

To forge a connection between the dot product and the geometry of Euclidean space,
recall that in $\mathbb{R}^3$ the distance from a point $(x_1, x_2, x_3)$ to the origin is given by

\[
d = \sqrt{x_1^2 + x_2^2 + x_3^2}
\]

Now let

\[
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix}
\]

be a vector in $\mathbb{R}^3$ in standard position. Using the distance formula, the length (or
norm) of v, which we denote by $\|\mathbf{v}\|$, is defined as the distance from the terminal
point of v to the origin and is given by

\[
\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + v_3^2}
\]

Observe that the quantity under the square root symbol can be written as the dot
product of v with itself. So the length of v can be written equivalently as

\[
\|\mathbf{v}\| = \sqrt{\mathbf{v}\cdot\mathbf{v}}
\]

Generalizing this idea to $\mathbb{R}^n$, we have the following definition.


DEFINITION 1 Length of a Vector in $\mathbb{R}^n$ The length (or norm) of a vector

\[
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
\]

in $\mathbb{R}^n$, denoted by $\|\mathbf{v}\|$, is defined as

\[
\|\mathbf{v}\| = \sqrt{v_1^2 + v_2^2 + \cdots + v_n^2}
\]


As an illustration, let $\mathbf{v} = \begin{bmatrix} 1 \\ 2 \\ -1 \end{bmatrix}$. Then

\[
\|\mathbf{v}\| = \sqrt{\mathbf{v}\cdot\mathbf{v}} = \sqrt{1^2 + 2^2 + (-1)^2} = \sqrt{6}
\]


In Sec. 2.1, it was shown that the difference u — v, of two vectors u and v in 
standard position, is a vector from the terminal point of v to the terminal point of u, 
as shown in Fig. 2. This provides the rationale for the following definition. 


Figure 2 


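DEFINITION 2 Distance Between Vectors in $\mathbb{R}^n$ Let

\[
\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_n \end{bmatrix}
\quad\text{and}\quad
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
\]

be vectors in $\mathbb{R}^n$. The distance between u and v is defined by

\[
\|\mathbf{u} - \mathbf{v}\| = \sqrt{(\mathbf{u} - \mathbf{v})\cdot(\mathbf{u} - \mathbf{v})}
\]


Since the orientation of a vector does not affect its length, the distance from u to
v is equal to the distance from v to u, so that

\[
\|\mathbf{u} - \mathbf{v}\| = \|\mathbf{v} - \mathbf{u}\|
\]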
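Definitions 1 and 2 translate directly into a few lines of code. The sketch below is only an illustration; the function names `dot`, `norm`, and `distance` are choices made here and do not come from the text.

```python
import math

def dot(u, v):
    """Dot product u1*v1 + ... + un*vn of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    """Length of v, the square root of v . v (Definition 1)."""
    return math.sqrt(dot(v, v))

def distance(u, v):
    """Distance between u and v, the length of u - v (Definition 2)."""
    return norm([a - b for a, b in zip(u, v)])

v = [1, 2, -1]
print(norm(v) ** 2)                 # approximately 6, so the length is sqrt(6)

u = [4, 0, 3]                       # an arbitrary second vector for illustration
print(distance(u, v) == distance(v, u))  # True: distance is symmetric
```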


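EXAMPLE 1 Show that if v is a vector in $\mathbb{R}^n$ and c is a real number, then

\[
\|c\mathbf{v}\| = |c|\,\|\mathbf{v}\|
\]

Solution Let

\[
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{bmatrix}
\]

Then

\[
\begin{aligned}
\|c\mathbf{v}\| &= \sqrt{(c\mathbf{v})\cdot(c\mathbf{v})} = \sqrt{(cv_1)^2 + (cv_2)^2 + \cdots + (cv_n)^2} \\
&= \sqrt{c^2v_1^2 + c^2v_2^2 + \cdots + c^2v_n^2} = \sqrt{c^2\,(v_1^2 + v_2^2 + \cdots + v_n^2)} \\
&= |c|\sqrt{v_1^2 + v_2^2 + \cdots + v_n^2} = |c|\,\|\mathbf{v}\|
\end{aligned}
\]


The result of Example 1 provides verification of the remarks following Definition
2 of Sec. 2.1 on the effect of multiplying a vector v by a real number c. Indeed,
as a consequence of Example 1, if |c| > 1, then cv is a stretching or dilation of
v; and is a shrinking or contraction of v if |c| < 1. If, in addition, c < 0, then the
direction of cv is reversed. As an illustration, let v be a vector in $\mathbb{R}^n$ with $\|\mathbf{v}\| = 10$.
Then 2v has length 20. The vector -3v has length 30 and points in the opposite
direction of v.
If the length of a vector v in $\mathbb{R}^n$ is 1, then v is called a unit vector.


PROPOSITION 1 Let v be a nonzero vector in $\mathbb{R}^n$. Then

\[
\mathbf{u}_{\mathbf{v}} = \frac{1}{\|\mathbf{v}\|}\,\mathbf{v}
\]

is a unit vector in the direction of v.


Proof Using Definition 1 and the result of Example 1, we have

\[
\|\mathbf{u}_{\mathbf{v}}\| = \left\|\frac{1}{\|\mathbf{v}\|}\,\mathbf{v}\right\| = \left|\frac{1}{\|\mathbf{v}\|}\right|\,\|\mathbf{v}\| = \frac{\|\mathbf{v}\|}{\|\mathbf{v}\|} = 1
\]

Since $1/\|\mathbf{v}\| > 0$, then the vector $\mathbf{u}_{\mathbf{v}}$ has the same direction as v.


EXAMPLE 2 Let

\[
\mathbf{v} = \begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}
\]

Find the unit vector $\mathbf{u}_{\mathbf{v}}$ in the direction of v.


Solution Observe that $\|\mathbf{v}\| = \sqrt{1^2 + 2^2 + (-2)^2} = 3$. Then by Proposition 1, we have

\[
\mathbf{u}_{\mathbf{v}} = \frac{1}{3}\,\mathbf{v} = \frac{1}{3}\begin{bmatrix} 1 \\ 2 \\ -2 \end{bmatrix}
\]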
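Proposition 1 amounts to dividing each component by the length of the vector. A minimal sketch, with `unit_vector` as a hypothetical helper name, applied to the vector of Example 2:

```python
import math

def unit_vector(v):
    """Return v scaled by 1/||v||; the zero vector has no direction."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [x / length for x in v]

u = unit_vector([1, 2, -2])          # ||v|| = 3, so u = (1/3, 2/3, -2/3)
print(u)
print(math.isclose(sum(x * x for x in u), 1.0))  # True: u has length 1
```

The guard for the zero vector reflects the hypothesis of Proposition 1 that v is nonzero.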


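Theorem 1 gives useful properties of the dot product. The proofs are straightforward
and are left to the reader.


THEOREM 1 Let u, v, and w be vectors in $\mathbb{R}^n$ and c a scalar.

1. $\mathbf{u}\cdot\mathbf{u} \ge 0$
2. $\mathbf{u}\cdot\mathbf{u} = 0$ if and only if $\mathbf{u} = \mathbf{0}$
3. $\mathbf{u}\cdot\mathbf{v} = \mathbf{v}\cdot\mathbf{u}$
4. $\mathbf{u}\cdot(\mathbf{v} + \mathbf{w}) = \mathbf{u}\cdot\mathbf{v} + \mathbf{u}\cdot\mathbf{w}$ and $(\mathbf{u} + \mathbf{v})\cdot\mathbf{w} = \mathbf{u}\cdot\mathbf{w} + \mathbf{v}\cdot\mathbf{w}$
5. $(c\mathbf{u})\cdot\mathbf{v} = c(\mathbf{u}\cdot\mathbf{v})$


EXAMPLE 3 Let u and v be vectors in $\mathbb{R}^n$. Use Theorem 1 to expand $(\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v})$.


Solution By repeated use of part 4, we have

\[
\begin{aligned}
(\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) &= (\mathbf{u} + \mathbf{v})\cdot\mathbf{u} + (\mathbf{u} + \mathbf{v})\cdot\mathbf{v} \\
&= \mathbf{u}\cdot\mathbf{u} + \mathbf{v}\cdot\mathbf{u} + \mathbf{u}\cdot\mathbf{v} + \mathbf{v}\cdot\mathbf{v}
\end{aligned}
\]

Now, by part 3, $\mathbf{v}\cdot\mathbf{u} = \mathbf{u}\cdot\mathbf{v}$, so that

\[
(\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) = \mathbf{u}\cdot\mathbf{u} + 2\,\mathbf{u}\cdot\mathbf{v} + \mathbf{v}\cdot\mathbf{v}
\]

or equivalently,

\[
(\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) = \|\mathbf{u}\|^2 + 2\,\mathbf{u}\cdot\mathbf{v} + \|\mathbf{v}\|^2
\]


The next result, known as the Cauchy-Schwarz inequality, is fundamental in developing
a geometry on $\mathbb{R}^n$. In particular, this inequality makes it possible to define the angle
between vectors.


THEOREM 2 Cauchy-Schwarz Inequality If u and v are vectors in $\mathbb{R}^n$, then

\[
|\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|
\]


Proof If $\mathbf{u} = \mathbf{0}$, then $\mathbf{u}\cdot\mathbf{v} = 0$. We also know, in this case, that
$\|\mathbf{u}\|\,\|\mathbf{v}\| = 0\,\|\mathbf{v}\| = 0$, so that equality holds. Now suppose that $\mathbf{u} \ne \mathbf{0}$ and k is a real number.
Consider the dot product of the vector $k\mathbf{u} + \mathbf{v}$ with itself. By Theorem 1, part 1,
we have

\[
(k\mathbf{u} + \mathbf{v})\cdot(k\mathbf{u} + \mathbf{v}) \ge 0
\]

Now, by Theorem 1, part 4, the left-hand side can be expanded to obtain

\[
k^2(\mathbf{u}\cdot\mathbf{u}) + 2k(\mathbf{u}\cdot\mathbf{v}) + \mathbf{v}\cdot\mathbf{v} \ge 0
\]

Observe that the expression on the left-hand side is quadratic in the variable k
with real coefficients. Letting $a = \mathbf{u}\cdot\mathbf{u}$, $b = \mathbf{u}\cdot\mathbf{v}$, and $c = \mathbf{v}\cdot\mathbf{v}$, we rewrite this
inequality as

\[
ak^2 + 2bk + c \ge 0
\]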


Figure 3


This inequality imposes conditions on the coefficients a, b, and c. Specifically,
the equation $ak^2 + 2bk + c = 0$ must have at most one real zero. Thus, by the
quadratic formula, the discriminant satisfies $(2b)^2 - 4ac \le 0$, or equivalently,

\[
(\mathbf{u}\cdot\mathbf{v})^2 \le (\mathbf{u}\cdot\mathbf{u})(\mathbf{v}\cdot\mathbf{v})
\]

After taking the square root of both sides, we obtain

\[
|\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|
\]

as desired.
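The discriminant argument in the proof can be spot-checked numerically. The sketch below is an illustration only, not a proof: it uses randomly generated integer vectors so that all arithmetic is exact, and the names `cs_gap` and `discriminant` are hypothetical.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cs_gap(u, v):
    """(u.u)(v.v) - (u.v)^2, which the inequality says is never negative."""
    return dot(u, u) * dot(v, v) - dot(u, v) ** 2

def discriminant(u, v):
    """Discriminant (2b)^2 - 4ac of a*k^2 + 2*b*k + c with a=u.u, b=u.v, c=v.v."""
    a, b, c = dot(u, u), dot(u, v), dot(v, v)
    return (2 * b) ** 2 - 4 * a * c

random.seed(0)
for _ in range(1000):
    u = [random.randint(-9, 9) for _ in range(4)]
    v = [random.randint(-9, 9) for _ in range(4)]
    assert cs_gap(u, v) >= 0
    assert discriminant(u, v) == -4 * cs_gap(u, v)

print(cs_gap([1, 2, 3], [3, 6, 9]))  # 0: parallel vectors give equality
```

The second assertion records the link used in the proof: the discriminant is nonpositive exactly when $(\mathbf{u}\cdot\mathbf{v})^2 \le (\mathbf{u}\cdot\mathbf{u})(\mathbf{v}\cdot\mathbf{v})$.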


The Angle between Two Vectors 


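With the Cauchy-Schwarz inequality in hand, we are now in a position to define the
angle between two vectors. To motivate this idea, let u and v be nonzero vectors in
$\mathbb{R}^2$ with $\mathbf{u} - \mathbf{v}$ the vector connecting the terminal point of v to the terminal point of
u, as shown in Fig. 3. As these three vectors form a triangle in $\mathbb{R}^2$, we apply the law
of cosines to obtain

\[
\|\mathbf{u} - \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2 - 2\,\|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta
\]

Using Theorem 1, we rewrite this equation as

\[
\mathbf{u}\cdot\mathbf{u} - 2\,\mathbf{u}\cdot\mathbf{v} + \mathbf{v}\cdot\mathbf{v} = \mathbf{u}\cdot\mathbf{u} + \mathbf{v}\cdot\mathbf{v} - 2\,\|\mathbf{u}\|\,\|\mathbf{v}\|\cos\theta
\]

After simplifying and solving for $\cos\theta$, we obtain

\[
\cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}
\]

Our aim now is to extend this result and use it as the definition of the cosine
of the angle between vectors in n-dimensional Euclidean space. To do so, we need
$|\cos\theta| \le 1$ for every angle $\theta$, that is,

\[
-1 \le \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} \le 1
\]

for all nonzero vectors u and v in $\mathbb{R}^n$. But this fact follows immediately from the
Cauchy-Schwarz inequality. Indeed, dividing both sides of

\[
|\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|
\]

by $\|\mathbf{u}\|\,\|\mathbf{v}\|$, we obtain

\[
\frac{|\mathbf{u}\cdot\mathbf{v}|}{\|\mathbf{u}\|\,\|\mathbf{v}\|} \le 1
\]

so that

\[
-1 \le \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} \le 1
\]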


This permits us to make the following definition. 


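DEFINITION 3 Angle Between Vectors in $\mathbb{R}^n$ If u and v are vectors in $\mathbb{R}^n$, then the cosine
of the angle $\theta$ between the vectors is defined by

\[
\cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}
\]


EXAMPLE 4 Find the angle between the two vectors

\[
\mathbf{u} = \begin{bmatrix} 2 \\ -2 \\ 3 \end{bmatrix}
\quad\text{and}\quad
\mathbf{v} = \begin{bmatrix} -1 \\ 2 \\ 2 \end{bmatrix}
\]


Solution The lengths of the vectors are

\[
\|\mathbf{u}\| = \sqrt{2^2 + (-2)^2 + 3^2} = \sqrt{17}
\quad\text{and}\quad
\|\mathbf{v}\| = \sqrt{(-1)^2 + 2^2 + 2^2} = 3
\]

and the dot product of the vectors is

\[
\mathbf{u}\cdot\mathbf{v} = 2(-1) + (-2)2 + 3(2) = 0
\]

By Definition 3, the cosine of the angle between u and v is given by

\[
\cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} = 0
\]

Hence, $\theta = \pi/2$ and the vectors are perpendicular. Such vectors are also called
orthogonal.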
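Definition 3 can be evaluated directly for the vectors of Example 4. This is a small sketch with illustrative function names; the clamp to the interval [-1, 1] is an addition made here to guard against floating-point rounding and is not part of the definition.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def angle(u, v):
    """Angle in radians between nonzero vectors u and v (Definition 3)."""
    c = dot(u, v) / (norm(u) * norm(v))
    c = max(-1.0, min(1.0, c))       # guard against rounding just outside [-1, 1]
    return math.acos(c)

theta = angle([2, -2, 3], [-1, 2, 2])
print(math.isclose(theta, math.pi / 2))  # True: the vectors are orthogonal
```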


DEFINITION 4 Orthogonal Vectors The vectors u and v are called orthogonal if the angle
between them is $\pi/2$.


As a direct consequence of Definition 3, we see that if u and v are nonzero vectors
in $\mathbb{R}^n$ with $\mathbf{u}\cdot\mathbf{v} = 0$, then $\cos\theta = 0$, so that $\theta = \pi/2$. On the other hand, if u and v
are orthogonal, then $\cos\theta = 0$, so that

\[
\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} = 0 \quad\text{therefore}\quad \mathbf{u}\cdot\mathbf{v} = 0
\]

The zero vector is a special case and is considered to be orthogonal to every vector in
$\mathbb{R}^n$ since $\mathbf{0}\cdot\mathbf{v} = 0$, for every vector v. These results are given in Proposition 2.


PROPOSITION 2 Two nonzero vectors u and v in $\mathbb{R}^n$ are orthogonal if and only if $\mathbf{u}\cdot\mathbf{v} = 0$. The
zero vector is orthogonal to every vector in $\mathbb{R}^n$.


One consequence of Proposition 2 is that if u and v are orthogonal, then

\[
\begin{aligned}
\|\mathbf{u} + \mathbf{v}\|^2 &= (\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) = \|\mathbf{u}\|^2 + 2\,\mathbf{u}\cdot\mathbf{v} + \|\mathbf{v}\|^2 \\
&= \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2
\end{aligned}
\]

This is a generalization of the Pythagorean theorem to $\mathbb{R}^n$.
Theorem 3 gives several useful properties of the norm in $\mathbb{R}^n$.


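THEOREM 3 Properties of the Norm in $\mathbb{R}^n$ Let u and v be vectors in $\mathbb{R}^n$ and c a scalar.

1. $\|\mathbf{v}\| \ge 0$
2. $\|\mathbf{v}\| = 0$ if and only if $\mathbf{v} = \mathbf{0}$
3. $\|c\mathbf{v}\| = |c|\,\|\mathbf{v}\|$
4. (Triangle inequality) $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$


Proof Parts 1 and 2 follow immediately from Definition 1 and Theorem 1. Part 3
is established in Example 1. To establish part 4, we have

\[
\begin{aligned}
\|\mathbf{u} + \mathbf{v}\|^2 &= (\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) \\
&= (\mathbf{u}\cdot\mathbf{u}) + 2(\mathbf{u}\cdot\mathbf{v}) + (\mathbf{v}\cdot\mathbf{v}) \\
&= \|\mathbf{u}\|^2 + 2(\mathbf{u}\cdot\mathbf{v}) + \|\mathbf{v}\|^2 \\
&\le \|\mathbf{u}\|^2 + 2\,|\mathbf{u}\cdot\mathbf{v}| + \|\mathbf{v}\|^2
\end{aligned}
\]

Now, by the Cauchy-Schwarz inequality, $|\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|$, so that

\[
\begin{aligned}
\|\mathbf{u} + \mathbf{v}\|^2 &\le \|\mathbf{u}\|^2 + 2\,\|\mathbf{u}\|\,\|\mathbf{v}\| + \|\mathbf{v}\|^2 \\
&= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2
\end{aligned}
\]

After taking square roots of both sides of this inequality, we obtain

\[
\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|
\]


Figure 4

Geometrically, part 4 of Theorem 3 confirms our intuition that the shortest distance
between two points is a straight line, as seen in Fig. 4.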
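A short numerical illustration of part 4 of Theorem 3 and of the Pythagorean relation for orthogonal vectors follows. It is a sketch with hypothetical helper names and reuses the orthogonal pair from Example 4.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(v):
    return math.sqrt(dot(v, v))

def add(u, v):
    return [a + b for a, b in zip(u, v)]

u = [2, -2, 3]
v = [-1, 2, 2]                      # orthogonal to u, since u . v = 0

# Triangle inequality: ||u + v|| <= ||u|| + ||v||
print(norm(add(u, v)) <= norm(u) + norm(v))               # True

# Pythagorean relation for orthogonal vectors, checked with exact integers
print(dot(add(u, v), add(u, v)) == dot(u, u) + dot(v, v))  # True

# Equality in the triangle inequality: w = 2u has the same direction as u
w = [2 * a for a in u]
print(math.isclose(norm(add(u, w)), norm(u) + norm(w)))   # True
```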


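PROPOSITION 3 Let u and v be vectors in $\mathbb{R}^n$. Then $\|\mathbf{u} + \mathbf{v}\| = \|\mathbf{u}\| + \|\mathbf{v}\|$ if and only if the
vectors have the same direction.


Proof First suppose that the vectors have the same direction. Then the angle
between the vectors is 0, so that $\cos\theta = 1$ and $\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|$. Therefore,

\[
\begin{aligned}
\|\mathbf{u} + \mathbf{v}\|^2 &= (\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) \\
&= \|\mathbf{u}\|^2 + 2(\mathbf{u}\cdot\mathbf{v}) + \|\mathbf{v}\|^2 \\
&= \|\mathbf{u}\|^2 + 2\,\|\mathbf{u}\|\,\|\mathbf{v}\| + \|\mathbf{v}\|^2 \\
&= (\|\mathbf{u}\| + \|\mathbf{v}\|)^2
\end{aligned}
\]

Taking square roots of both sides of the previous equation gives
$\|\mathbf{u} + \mathbf{v}\| = \|\mathbf{u}\| + \|\mathbf{v}\|$.
Conversely, suppose that $\|\mathbf{u} + \mathbf{v}\| = \|\mathbf{u}\| + \|\mathbf{v}\|$. After squaring both sides,
we obtain

\[
\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + 2\,\|\mathbf{u}\|\,\|\mathbf{v}\| + \|\mathbf{v}\|^2
\]

However, we also have

\[
\|\mathbf{u} + \mathbf{v}\|^2 = (\mathbf{u} + \mathbf{v})\cdot(\mathbf{u} + \mathbf{v}) = \|\mathbf{u}\|^2 + 2\,\mathbf{u}\cdot\mathbf{v} + \|\mathbf{v}\|^2
\]

Equating both expressions for $\|\mathbf{u} + \mathbf{v}\|^2$ gives

\[
\|\mathbf{u}\|^2 + 2\,\|\mathbf{u}\|\,\|\mathbf{v}\| + \|\mathbf{v}\|^2 = \|\mathbf{u}\|^2 + 2\,\mathbf{u}\cdot\mathbf{v} + \|\mathbf{v}\|^2
\]

Simplifying the last equation, we obtain $\mathbf{u}\cdot\mathbf{v} = \|\mathbf{u}\|\,\|\mathbf{v}\|$ and hence

\[
\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} = 1
\]

Therefore, $\cos\theta = 1$, so that $\theta = 0$ and the vectors have the same direction.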


Fact Summary 


All vectors are in $\mathbb{R}^n$.


1. The length of a vector and the distance between two vectors are natural
extensions of the same geometric notions in $\mathbb{R}^2$ and $\mathbb{R}^3$.

2. The dot product of a vector with itself gives the square of its length and is
0 only when the vector is the zero vector. The dot product of two vectors is
commutative and distributes through vector addition.

3. By using the Cauchy-Schwarz inequality $|\mathbf{u}\cdot\mathbf{v}| \le \|\mathbf{u}\|\,\|\mathbf{v}\|$, the angle
between vectors is defined by

\[
\cos\theta = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|}
\]

4. Two vectors are orthogonal if and only if the dot product of the vectors is 0.

5. The norm of a vector is nonnegative, is 0 only when the vector is the zero
vector, and satisfies

\[
\|c\mathbf{u}\| = |c|\,\|\mathbf{u}\| \quad\text{and}\quad \|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|
\]

Equality holds in the last inequality only when the vectors are in the same
direction.

6. If u and v are orthogonal vectors, then the Pythagorean theorem

\[
\|\mathbf{u} + \mathbf{v}\|^2 = \|\mathbf{u}\|^2 + \|\mathbf{v}\|^2
\]

holds.


Exercise Set 6.1


In Exercises 1—4, let In Exercises 11—16, let 


0 -3 =i 
asy! y= j=l u=| -2 v=| -1 
3 2 3 =3 
1 11. Find || u ||. 
w= 1 
=3 12. Find the distance between u and v. 
Compute the quantity. 13. Find a unit vector in the direction of u. 
1. u-v 14. Find the cosine of the angle between the two 
2. ibA vectors. Are the vectors orthogonal? Explain. 
vev 
3. u. (v + 2w) 15. Find a vector in the opposite direction of v with 
length 3. 
q iY 
wew 16. Find a nonzero vector w that is orthogonal to both 
In Exercises 5—10, let u and v. 
1 2 g 
u= 5 v= 1 17. Find a scalar c, so that | 3 | is orthogonal to 
5. Find ||u ||. k | 
> |: 
6. Find the distance between u and v. i 
7. Find a unit vector in the direction of u. 18. Find a scalar c, so that c | is orthogonal to 
8. Find the cosine of the angle between the two 0 
vectors. Are the vectors orthogonal? Explain. 2 
9. Find a vector in the direction of v with length 10. -1 


10. Find a nonzero vector w that is orthogonal to both 
u and v. 




In Exercises 19—22, let 


1 6 
vV = w= —2 
=l 2 
=] -1/¥3 
v3 = =2 V4 = 1/3 
1 1/V3 
V5 = = 


19. Determine which pairs of the vectors are 
orthogonal. 


20. Determine which pairs of the vectors are in the 
same direction. 


21. Determine which pairs of the vectors are in the 
opposite direction. 


22. Determine which of the vectors are unit vectors. 


In Exercises 23–28, find the projection of u onto v
given by

\[
\mathbf{w} = \frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\,\mathbf{v}
\]

The vector w is called the orthogonal projection of u
onto v. Sketch the three vectors u, v, and w.


27. u = 


Wn 
< 
ll 
=N COCOrF 


| 
| 
| 


28. $\mathbf{u} = \begin{bmatrix} 2 \\ 3 \\ -1 \end{bmatrix}$, $\mathbf{v} = \begin{bmatrix} 0 \\ 2 \\ 3 \end{bmatrix}$


29. Let $S = \{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ and suppose $\mathbf{v}\cdot\mathbf{u}_i = 0$
for each $i = 1, \ldots, n$. Show that v is orthogonal
to every vector in span(S).


30. Let v be a fixed vector in $\mathbb{R}^n$ and define
$S = \{\mathbf{u} \mid \mathbf{u}\cdot\mathbf{v} = 0\}$. Show that S is a subspace of
$\mathbb{R}^n$.


31. Let $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be a set of nonzero
vectors which are pairwise orthogonal. That is, if
$i \ne j$, then $\mathbf{v}_i\cdot\mathbf{v}_j = 0$. Show that S is linearly
independent.


32. Let A be an n x n invertible matrix. Show that if
$i \ne j$, then row vector i of A and column vector
j of $A^{-1}$ are orthogonal.


33. Show that for all vectors u and v in $\mathbb{R}^n$,

\[
\|\mathbf{u} + \mathbf{v}\|^2 + \|\mathbf{u} - \mathbf{v}\|^2 = 2\,\|\mathbf{u}\|^2 + 2\,\|\mathbf{v}\|^2
\]


34. a. Find a vector that is orthogonal to every vector
in the plane P: x + 2y - z = 0.

b. Find a matrix A such that the null space N(A)
is the plane x + 2y - z = 0.


35. Suppose that the column vectors of an n x n
matrix A are pairwise orthogonal. Find $A^tA$.


36. Let A be an n x n matrix and u and v vectors in
$\mathbb{R}^n$. Show that

\[
\mathbf{u}\cdot(A\mathbf{v}) = (A^t\mathbf{u})\cdot\mathbf{v}
\]


37. Let A be an n x n matrix. Show that A is
symmetric if and only if

\[
(A\mathbf{u})\cdot\mathbf{v} = \mathbf{u}\cdot(A\mathbf{v})
\]

for all u and v in $\mathbb{R}^n$. Hint: See Exercise 36.


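6.2 Inner Product Spaces


In Sec. 6.1 we introduced the concepts of the length of a vector and the angle between
vectors in Euclidean space. Both of these notions are defined in terms of the dot
product and provide a geometry on $\mathbb{R}^n$. Notice that the dot product on $\mathbb{R}^n$ defines a
function from $\mathbb{R}^n \times \mathbb{R}^n$ into $\mathbb{R}$. That is, the dot product operates on two vectors in
$\mathbb{R}^n$, producing a real number. To extend these ideas to an abstract vector space V,
we require a function from $V \times V$ into $\mathbb{R}$ that generalizes the properties of the dot
product given in Theorem 1 of Sec. 6.1.


DEFINITION 1 Inner Product Let V be a vector space over $\mathbb{R}$. An inner product on V is
a function that associates with each pair of vectors u and v in V a real number,
denoted by $\langle \mathbf{u}, \mathbf{v} \rangle$, that satisfies the following axioms:

1. $\langle \mathbf{u}, \mathbf{u} \rangle \ge 0$ and $\langle \mathbf{u}, \mathbf{u} \rangle = 0$ if and only if $\mathbf{u} = \mathbf{0}$
2. $\langle \mathbf{u}, \mathbf{v} \rangle = \langle \mathbf{v}, \mathbf{u} \rangle$
3. $\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle$
4. $\langle c\mathbf{u}, \mathbf{v} \rangle = c\,\langle \mathbf{u}, \mathbf{v} \rangle$

The last two properties make the inner product linear in the first variable. Using
the symmetry axiom, it can also be shown that the inner product is linear in the
second variable, that is,
3. $\langle \mathbf{u}, \mathbf{v} + \mathbf{w} \rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle \mathbf{u}, \mathbf{w} \rangle$
4. $\langle \mathbf{u}, c\mathbf{v} \rangle = c\,\langle \mathbf{u}, \mathbf{v} \rangle$
With these additional properties, the inner product is said to be bilinear.

A vector space V with an inner product is called an inner product space.


By Theorem 1 of Sec. 6.1, the dot product is an inner product on Euclidean
n-space. Thus, $\mathbb{R}^n$ with the dot product is an inner product space.


EXAMPLE 1 Let $\mathbf{v} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$. Find all vectors u in $\mathbb{R}^2$ such that $\langle \mathbf{u}, \mathbf{v} \rangle = 0$, where the inner
product is the dot product.


Solution If $\mathbf{u} = \begin{bmatrix} x \\ y \end{bmatrix}$, then

\[
\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u}\cdot\mathbf{v} = x + 3y
\]

so that $\langle \mathbf{u}, \mathbf{v} \rangle = 0$ if and only if $y = -\tfrac{1}{3}x$. Therefore, the set of all vectors such
that $\langle \mathbf{u}, \mathbf{v} \rangle = 0$ is given by $S = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ -\tfrac{1}{3} \end{bmatrix} \right\}$. The vector v and the set S are
shown in Fig. 1. Notice that each vector in S is perpendicular to v.


Figure 1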


For another example, consider the vector space of polynomials $P_2$. To define an
inner product on $P_2$, let $p(x) = a_0 + a_1x + a_2x^2$ and $q(x) = b_0 + b_1x + b_2x^2$. Now
let $\langle \cdot, \cdot \rangle\colon P_2 \times P_2 \to \mathbb{R}$ be the function defined by

\[
\langle p, q \rangle = a_0b_0 + a_1b_1 + a_2b_2
\]

Notice that this function is similar to the dot product on $\mathbb{R}^3$. The proof that $P_2$ is
an inner product space follows along the same lines as the proof of Theorem 1 of
Sec. 6.1.

Another way to define an inner product on $P_2$ is to use the definite integral.
Specifically, let p(x) and q(x) be polynomials in $P_2$, and let $\langle \cdot, \cdot \rangle$ be the function
defined by

\[
\langle p, q \rangle = \int_0^1 p(x)q(x)\,dx
\]

This function is also an inner product on $P_2$. The justification, in this case, is based
on the fundamental properties of the Riemann integral which can be found in any text
on real analysis.


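EXAMPLE 2 Let $V = P_2$ with inner product defined by

\[
\langle p, q \rangle = \int_0^1 p(x)q(x)\,dx
\]

a. Let $p(x) = 1 - x^2$ and $q(x) = 1 - x + 2x^2$. Find $\langle p, q \rangle$.
b. Let $p(x) = 1 - x^2$. Verify that $\langle p, p \rangle > 0$.


Solution a. Using the definition given for the inner product, we have

\[
\begin{aligned}
\langle p, q \rangle &= \int_0^1 (1 - x^2)(1 - x + 2x^2)\,dx \\
&= \int_0^1 (1 - x + x^2 + x^3 - 2x^4)\,dx \\
&= \left( x - \frac{1}{2}x^2 + \frac{1}{3}x^3 + \frac{1}{4}x^4 - \frac{2}{5}x^5 \right)\Big|_0^1 \\
&= \frac{41}{60}
\end{aligned}
\]

b. The inner product of p with itself is given by

\[
\begin{aligned}
\langle p, p \rangle &= \int_0^1 (1 - x^2)(1 - x^2)\,dx \\
&= \int_0^1 (1 - 2x^2 + x^4)\,dx \\
&= \left( x - \frac{2}{3}x^3 + \frac{1}{5}x^5 \right)\Big|_0^1 \\
&= \frac{8}{15} > 0
\end{aligned}
\]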
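The two integrals in Example 2 can be reproduced with exact rational arithmetic. In this sketch a polynomial is stored as a list of coefficients in increasing degree, which is a representation chosen here for illustration; the helper names are hypothetical.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [a0, a1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += Fraction(a) * Fraction(b)
    return out

def integrate(p, lo, hi):
    """Exact integral of the polynomial p over [lo, hi]."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(c * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, c in enumerate(p))

def inner(p, q, lo=0, hi=1):
    """<p, q> = integral of p(x) q(x) over [lo, hi]."""
    return integrate(poly_mul(p, q), lo, hi)

p = [1, 0, -1]     # 1 - x^2
q = [1, -1, 2]     # 1 - x + 2x^2
print(inner(p, q))  # 41/60
print(inner(p, p))  # 8/15
```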


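Example 3 gives an illustration of an inner product on $\mathbb{R}^n$ that is not the dot
product.


EXAMPLE 3 Let $V = \mathbb{R}^2$ and

\[
\mathbf{u} = \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}
\quad\text{and}\quad
\mathbf{v} = \begin{bmatrix} v_1 \\ v_2 \end{bmatrix}
\]

be vectors in V. Let k be a fixed positive real number, and define the function
$\langle \cdot, \cdot \rangle\colon \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ by

\[
\langle \mathbf{u}, \mathbf{v} \rangle = u_1v_1 + ku_2v_2
\]

Show that V is an inner product space.


Solution First we show that $\langle \cdot, \cdot \rangle$ is nonnegative. From the definition above, we have

\[
\langle \mathbf{u}, \mathbf{u} \rangle = u_1^2 + ku_2^2
\]

Since k > 0, then $u_1^2 + ku_2^2 \ge 0$ for every vector u. In addition,

\[
u_1^2 + ku_2^2 = 0 \quad\text{if and only if}\quad u_1 = u_2 = 0
\]

or equivalently if $\mathbf{u} = \mathbf{0}$. The property of symmetry also holds since

\[
\langle \mathbf{u}, \mathbf{v} \rangle = u_1v_1 + ku_2v_2 = v_1u_1 + kv_2u_2 = \langle \mathbf{v}, \mathbf{u} \rangle
\]

Next, let $\mathbf{w} = \begin{bmatrix} w_1 \\ w_2 \end{bmatrix}$ be another vector in $\mathbb{R}^2$. Then

\[
\begin{aligned}
\langle \mathbf{u} + \mathbf{v}, \mathbf{w} \rangle &= (u_1 + v_1)w_1 + k(u_2 + v_2)w_2 \\
&= (u_1w_1 + ku_2w_2) + (v_1w_1 + kv_2w_2) \\
&= \langle \mathbf{u}, \mathbf{w} \rangle + \langle \mathbf{v}, \mathbf{w} \rangle
\end{aligned}
\]

Finally, if c is a scalar, then

\[
\langle c\mathbf{u}, \mathbf{v} \rangle = (cu_1)v_1 + k(cu_2)v_2 = c(u_1v_1 + ku_2v_2) = c\,\langle \mathbf{u}, \mathbf{v} \rangle
\]

Therefore, $\mathbb{R}^2$ with this definition for $\langle \cdot, \cdot \rangle$ is an inner product space.


Notice that in Example 3 the requirement that k > 0 is necessary. For example,
if k = -1 and $\mathbf{u} = \begin{bmatrix} 1 \\ 2 \end{bmatrix}$, then the inner product of u with itself is given by
$\langle \mathbf{u}, \mathbf{u} \rangle = (1)^2 + (-1)(2)^2 = -3$, which violates the first axiom of Definition 1.
Again using $\mathbb{R}^n$ as our model, we now define the length (or norm) of a vector v
in an inner product space V as

\[
\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}
\]

The distance between two vectors u and v in V is then defined by

\[
\|\mathbf{u} - \mathbf{v}\| = \sqrt{\langle \mathbf{u} - \mathbf{v}, \mathbf{u} - \mathbf{v} \rangle}
\]

The norm in an inner product space satisfies the same properties as the norm in
$\mathbb{R}^n$. The results are summarized in Theorem 4.


THEOREM 4 Properties of the Norm in an Inner Product Space Let u and v be vectors
in an inner product space V and c a scalar.

1. $\|\mathbf{v}\| \ge 0$
2. $\|\mathbf{v}\| = 0$ if and only if $\mathbf{v} = \mathbf{0}$
3. $\|c\mathbf{v}\| = |c|\,\|\mathbf{v}\|$
4. $|\langle \mathbf{u}, \mathbf{v} \rangle| \le \|\mathbf{u}\|\,\|\mathbf{v}\|$ (Cauchy-Schwarz inequality)
5. $\|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\|$ (Triangle inequality)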


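EXAMPLE 4 Let $V = \mathbb{R}^2$ with inner product defined by

\[
\langle \mathbf{u}, \mathbf{v} \rangle = u_1v_1 + 3u_2v_2
\]

Let

\[
\mathbf{u} = \begin{bmatrix} 2 \\ -2 \end{bmatrix}
\quad\text{and}\quad
\mathbf{v} = \begin{bmatrix} 1 \\ 4 \end{bmatrix}
\]

a. Verify that the Cauchy-Schwarz inequality is upheld.

b. Verify that the Triangle Inequality is upheld.


Solution a. Using the given definition for the inner product, we have

\[
|\langle \mathbf{u}, \mathbf{v} \rangle| = |(2)(1) + 3(-2)(4)| = |-22| = 22
\]

The norms of u and v are given, respectively, by

\[
\|\mathbf{u}\| = \sqrt{\langle \mathbf{u}, \mathbf{u} \rangle} = \sqrt{(2)^2 + 3(-2)^2} = \sqrt{16} = 4
\]

and

\[
\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle} = \sqrt{(1)^2 + 3(4)^2} = \sqrt{49} = 7
\]

Since

\[
22 = |\langle \mathbf{u}, \mathbf{v} \rangle| \le \|\mathbf{u}\|\,\|\mathbf{v}\| = 28
\]

the Cauchy-Schwarz inequality is satisfied for the vectors u and v.

b. To verify the triangle inequality, observe that

\[
\mathbf{u} + \mathbf{v} = \begin{bmatrix} 2 \\ -2 \end{bmatrix} + \begin{bmatrix} 1 \\ 4 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \end{bmatrix}
\]

so that

\[
\|\mathbf{u} + \mathbf{v}\| = \sqrt{(3)^2 + 3(2)^2} = \sqrt{21}
\]

Since

\[
\sqrt{21} = \|\mathbf{u} + \mathbf{v}\| \le \|\mathbf{u}\| + \|\mathbf{v}\| = 4 + 7 = 11
\]

the triangle inequality holds for u and v.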
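The weighted inner product of Examples 3 and 4 is easy to experiment with. The sketch below, with hypothetical function names, repeats the numbers of Example 4 for k = 3 and shows the failure of the first axiom when k = -1.

```python
import math

def weighted_inner(u, v, k):
    """<u, v> = u1*v1 + k*u2*v2 on R^2, as in Example 3."""
    return u[0] * v[0] + k * u[1] * v[1]

def weighted_norm(u, k):
    """Norm induced by the weighted inner product (requires k > 0)."""
    return math.sqrt(weighted_inner(u, u, k))

u, v, k = (2, -2), (1, 4), 3
s = (u[0] + v[0], u[1] + v[1])               # u + v = (3, 2)

print(abs(weighted_inner(u, v, k)))          # 22
print(weighted_norm(u, k), weighted_norm(v, k))   # 4.0 7.0
print(abs(weighted_inner(u, v, k)) <= weighted_norm(u, k) * weighted_norm(v, k))  # True
print(weighted_norm(s, k) <= weighted_norm(u, k) + weighted_norm(v, k))           # True

# With k = -1 the "inner product" of (1, 2) with itself is negative.
print(weighted_inner((1, 2), (1, 2), -1))    # -3
```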


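Orthogonal Sets


Taking the same approach as in Sec. 6.1, we define the cosine of the angle between
the vectors u and v in an inner product space V by

\[
\cos\theta = \frac{\langle \mathbf{u}, \mathbf{v} \rangle}{\|\mathbf{u}\|\,\|\mathbf{v}\|}
\]

As before, the vectors u and v in V are orthogonal provided that $\langle \mathbf{u}, \mathbf{v} \rangle = 0$.


EXAMPLE 5 Let $V = P_2$ with inner product defined by

\[
\langle p, q \rangle = \int_{-1}^{1} p(x)q(x)\,dx
\]

a. Show that the vectors in

\[
S = \left\{ 1,\; x,\; \tfrac{1}{2}(3x^2 - 1) \right\}
\]

are mutually orthogonal.

b. Find the length of each vector in S.


Solution a. The inner product of each pair of vectors in S is

\[
\begin{aligned}
\langle 1, x \rangle &= \int_{-1}^{1} x\,dx = \frac{1}{2}x^2\Big|_{-1}^{1} = 0 \\
\left\langle 1, \tfrac{1}{2}(3x^2 - 1) \right\rangle &= \int_{-1}^{1} \frac{1}{2}(3x^2 - 1)\,dx = \frac{1}{2}(x^3 - x)\Big|_{-1}^{1} = 0 \\
\left\langle x, \tfrac{1}{2}(3x^2 - 1) \right\rangle &= \int_{-1}^{1} \frac{1}{2}(3x^3 - x)\,dx = \frac{1}{2}\left( \frac{3}{4}x^4 - \frac{1}{2}x^2 \right)\Big|_{-1}^{1} = 0
\end{aligned}
\]

Since each pair of distinct vectors is orthogonal, the vectors in S are mutually
orthogonal.

b. For the lengths of the vectors in S, we have

\[
\begin{aligned}
\|1\| &= \sqrt{\langle 1, 1 \rangle} = \sqrt{\int_{-1}^{1} dx} = \sqrt{2} \\
\|x\| &= \sqrt{\langle x, x \rangle} = \sqrt{\int_{-1}^{1} x^2\,dx} = \sqrt{\frac{2}{3}} \\
\left\| \tfrac{1}{2}(3x^2 - 1) \right\| &= \sqrt{\left\langle \tfrac{1}{2}(3x^2 - 1), \tfrac{1}{2}(3x^2 - 1) \right\rangle} = \sqrt{\frac{1}{4}\int_{-1}^{1} (3x^2 - 1)^2\,dx} = \sqrt{\frac{2}{5}}
\end{aligned}
\]


DEFINITION 2 Orthogonal Set A set of vectors $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ in an inner product space is
called orthogonal if the vectors are mutually orthogonal; that is, if $i \ne j$, then
$\langle \mathbf{v}_i, \mathbf{v}_j \rangle = 0$. If in addition, $\|\mathbf{v}_i\| = 1$, for all $i = 1, \ldots, n$, then the set of vectors
is called orthonormal.


Observe that the vectors of Example 5 form an orthogonal set. They do not, however,
form an orthonormal set. Proposition 4 shows that the zero vector is orthogonal
to every vector in an inner product space.


PROPOSITION 4 Let V be an inner product space. Then $\langle \mathbf{v}, \mathbf{0} \rangle = 0$ for every vector v in V.

Proof Let v be a vector in V. Then

\[
\langle \mathbf{v}, \mathbf{0} \rangle = \langle \mathbf{v}, \mathbf{0} + \mathbf{0} \rangle = \langle \mathbf{v}, \mathbf{0} \rangle + \langle \mathbf{v}, \mathbf{0} \rangle
\]

After subtracting $\langle \mathbf{v}, \mathbf{0} \rangle$ from both sides of the previous equation, we have $\langle \mathbf{v}, \mathbf{0} \rangle = 0$
as desired.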
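The orthogonality relations and squared lengths computed in Example 5 can be confirmed with exact arithmetic. As before, this is a sketch under an assumed representation (coefficient lists in increasing degree) with hypothetical helper names.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists [a0, a1, ...]."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q, lo=-1, hi=1):
    """<p, q> = integral of p(x) q(x) over [lo, hi], computed exactly."""
    lo, hi = Fraction(lo), Fraction(hi)
    return sum(c * (hi ** (k + 1) - lo ** (k + 1)) / (k + 1)
               for k, c in enumerate(poly_mul(p, q)))

one = [Fraction(1)]
x = [Fraction(0), Fraction(1)]
p2 = [Fraction(-1, 2), Fraction(0), Fraction(3, 2)]   # (1/2)(3x^2 - 1)

S = [one, x, p2]
# Distinct pairs have inner product 0, so S is an orthogonal set.
print([inner(S[i], S[j]) == 0 for i in range(3) for j in range(i + 1, 3)])
# Squared lengths 2, 2/3, 2/5: none equals 1, so S is not orthonormal.
print([inner(p, p) for p in S])
```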


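A useful property of orthogonal sets of nonzero vectors is that they are linearly
independent. For example, the set of standard basis vectors $\{\mathbf{e}_1, \mathbf{e}_2, \mathbf{e}_3\}$ in $\mathbb{R}^3$ is orthogonal
and linearly independent. Theorem 5 relates the notions of orthogonality and linear
independence in an inner product space.


THEOREM 5 If $S = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an orthogonal set of nonzero vectors in an inner product
space V, then S is linearly independent.


Proof Since the set S is an orthogonal set of nonzero vectors,

\[
\langle \mathbf{v}_i, \mathbf{v}_j \rangle = 0 \ \text{ for } i \ne j \quad\text{and}\quad \langle \mathbf{v}_i, \mathbf{v}_i \rangle = \|\mathbf{v}_i\|^2 \ne 0 \ \text{ for all } i
\]

Now suppose that

\[
c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n = \mathbf{0}
\]

The vectors are linearly independent if and only if the only solution to the previous
equation is the trivial solution $c_1 = c_2 = \cdots = c_n = 0$. Now let $\mathbf{v}_j$ be an element
of S. Take the inner product on both sides of the previous equation with $\mathbf{v}_j$ so that

\[
\langle \mathbf{v}_j, (c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{j-1}\mathbf{v}_{j-1} + c_j\mathbf{v}_j + c_{j+1}\mathbf{v}_{j+1} + \cdots + c_n\mathbf{v}_n) \rangle = \langle \mathbf{v}_j, \mathbf{0} \rangle
\]

By the linearity of the inner product and the fact that S is orthogonal, this equation
reduces to

\[
c_j\,\langle \mathbf{v}_j, \mathbf{v}_j \rangle = \langle \mathbf{v}_j, \mathbf{0} \rangle
\]

Now, by Proposition 4 and the fact that $\|\mathbf{v}_j\| \ne 0$, we have

\[
c_j\,\|\mathbf{v}_j\|^2 = 0 \quad\text{so that}\quad c_j = 0
\]

Since this holds for each $j = 1, \ldots, n$, then $c_1 = c_2 = \cdots = c_n = 0$ and therefore
S is linearly independent.


COROLLARY 1 If V is an inner product space of dimension n, then any orthogonal set of n nonzero
vectors is a basis for V.


The proof of this corollary is a direct result of Theorem 12 of Sec. 3.3. Theorem
6 provides us with an easy way to find the coordinates of a vector relative to
an orthonormal basis. This property underscores the usefulness and desirability of
orthonormal bases.


THEOREM 6 If $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an ordered orthonormal basis for an inner product space
V and $\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n$, then the coordinates of v relative to B are
given by $c_i = \langle \mathbf{v}_i, \mathbf{v} \rangle$ for each $i = 1, 2, \ldots, n$.


Proof Let $\mathbf{v}_i$ be a vector in B. Taking the inner product on both sides of

\[
\mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{i-1}\mathbf{v}_{i-1} + c_i\mathbf{v}_i + c_{i+1}\mathbf{v}_{i+1} + \cdots + c_n\mathbf{v}_n
\]

with $\mathbf{v}_i$ on the right gives

\[
\begin{aligned}
\langle \mathbf{v}, \mathbf{v}_i \rangle &= \langle (c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_{i-1}\mathbf{v}_{i-1} + c_i\mathbf{v}_i + c_{i+1}\mathbf{v}_{i+1} + \cdots + c_n\mathbf{v}_n), \mathbf{v}_i \rangle \\
&= c_1\,\langle \mathbf{v}_1, \mathbf{v}_i \rangle + \cdots + c_i\,\langle \mathbf{v}_i, \mathbf{v}_i \rangle + \cdots + c_n\,\langle \mathbf{v}_n, \mathbf{v}_i \rangle
\end{aligned}
\]

Since B is an orthonormal set, this reduces to

\[
\langle \mathbf{v}, \mathbf{v}_i \rangle = c_i\,\langle \mathbf{v}_i, \mathbf{v}_i \rangle = c_i
\]

As this argument can be carried out for any vector in B, then $c_i = \langle \mathbf{v}, \mathbf{v}_i \rangle$ for all
$i = 1, 2, \ldots, n$.


In Theorem 6, if the ordered basis B is orthogonal and v is any vector in V, then
the coordinates relative to B are given by

\[
c_i = \frac{\langle \mathbf{v}, \mathbf{v}_i \rangle}{\langle \mathbf{v}_i, \mathbf{v}_i \rangle} \quad\text{for each } i = 1, \ldots, n
\]

so that

\[
\mathbf{v} = \frac{\langle \mathbf{v}, \mathbf{v}_1 \rangle}{\langle \mathbf{v}_1, \mathbf{v}_1 \rangle}\,\mathbf{v}_1 + \frac{\langle \mathbf{v}, \mathbf{v}_2 \rangle}{\langle \mathbf{v}_2, \mathbf{v}_2 \rangle}\,\mathbf{v}_2 + \cdots + \frac{\langle \mathbf{v}, \mathbf{v}_n \rangle}{\langle \mathbf{v}_n, \mathbf{v}_n \rangle}\,\mathbf{v}_n
\]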
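The coordinate formula for an orthogonal basis requires no linear system to be solved. The sketch below illustrates this in $\mathbb{R}^3$ with the dot product; the basis and the vector are hypothetical values chosen only for the illustration.

```python
from fractions import Fraction

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def coordinates(v, basis, inner=dot):
    """Coordinates c_i = <v, v_i> / <v_i, v_i> relative to an orthogonal basis."""
    return [Fraction(inner(v, b), inner(b, b)) for b in basis]

# A hypothetical orthogonal (not orthonormal) basis of R^3.
basis = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]
v = (3, 1, 4)

c = coordinates(v, basis)
print(c)   # [Fraction(2, 1), Fraction(1, 1), Fraction(2, 1)]

# Rebuild v from its coordinates as a check.
rebuilt = tuple(sum(ci * b[k] for ci, b in zip(c, basis)) for k in range(3))
print(rebuilt == v)   # True
```

If the basis is orthonormal, each denominator equals 1 and the formula reduces to the one in Theorem 6.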


Fact Summary 


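All vectors are in an inner product space.


1. An inner product on a vector space is a function that assigns to each pair of
vectors a real number and generalizes the properties of the dot product on
$\mathbb{R}^n$.

2. The norm of a vector is defined analogously to the definition in $\mathbb{R}^n$ by
$\|\mathbf{v}\| = \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$.

3. An orthogonal set of nonzero vectors is linearly independent. Thus, any
set of n nonzero orthogonal vectors is a basis for an inner product space of
dimension n.

4. When an arbitrary vector is written in terms of the vectors in an orthogonal
basis, the coefficients are given explicitly by an expression in terms of the
inner product. If $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is the orthogonal basis and v is an arbitrary
vector, then

\[
\mathbf{v} = \frac{\langle \mathbf{v}, \mathbf{v}_1 \rangle}{\langle \mathbf{v}_1, \mathbf{v}_1 \rangle}\,\mathbf{v}_1 + \frac{\langle \mathbf{v}, \mathbf{v}_2 \rangle}{\langle \mathbf{v}_2, \mathbf{v}_2 \rangle}\,\mathbf{v}_2 + \cdots + \frac{\langle \mathbf{v}, \mathbf{v}_n \rangle}{\langle \mathbf{v}_n, \mathbf{v}_n \rangle}\,\mathbf{v}_n
\]

If B is an orthonormal basis, then

\[
\mathbf{v} = \langle \mathbf{v}, \mathbf{v}_1 \rangle\,\mathbf{v}_1 + \cdots + \langle \mathbf{v}, \mathbf{v}_n \rangle\,\mathbf{v}_n
\]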


Exercise Set 6.2 


In Exercises 1—10, determine whether V is an inner 
product space. 


1.V=R 
(u, v) = uiv — 2uyv2 — 2uz2v1 + 3u2Vv2 
2,.V=R 
(u, v) = —u1v; + 2u; v2 
3. V = R? 
(u, v) = uzuy +0303 
4. V=R 
(u, v) = uy vy + 2u2v2 + 3303 
5. V= R” 
(u, v) =u-v 
6. V = Mmxn 
(A, B) = tr(B' A) 
7. V= Mme 
(A, B) = ee ij 
i=l j=l 
8. V =P, 
(p.q) = Loa 
9% V= comi, i 
(S, 8) = fli F@)g ae dx 
10. V = C®[-1, 1] 


g)= Ji f(x)g(x)x dx 


In Exercises 11–14, let V = C[a, b] with inner
product

\[
\langle f, g \rangle = \int_a^b f(x)g(x)\,dx
\]

Verify that the set of vectors is orthogonal.


11. $\{1, \cos x, \sin x\}$; $a = -\pi$, $b = \pi$

12. $\{1, x, \tfrac{1}{2}(5x^3 - 3x)\}$; $a = -1$, $b = 1$

13. $\{1, 2x - 1, -x^2 + x - \tfrac{1}{6}\}$; $a = 0$, $b = 1$

14. $\{1, \cos x, \sin x, \cos 2x, \sin 2x\}$; $a = -\pi$, $b = \pi$


In Exercises 15–18, let V = C[a, b] with inner
product

\[
\langle f, g \rangle = \int_a^b f(x)g(x)\,dx
\]

a. Find the distance between the vectors f and g.

b. Find the cosine of the angle between the
vectors f and g.


15. $f(x) = 3x - 2$, $g(x) = x^2 + 1$; $a = 0$, $b = 1$
16. $f(x) = \cos x$, $g(x) = \sin x$; $a = -\pi$, $b = \pi$
17. $f(x) = x$, $g(x) = e^x$; $a = 0$, $b = 1$

18. f(x) =e", ga) =e*;a=-l1,b=1 


In Exercises 19 and 20, let V = P2 with inner product 


(p.q) = Yn 


a. Find the distance ees the vectors p and q. 


b. Find the cosine of the angle between the 
vectors p and q. 


19. p(x) =x +x — 2, q(x) = —x? +x +2 
p(x) =x —3,q(x)=2x— 6 
In Exercises 21—24, let V = Mnxn with inner product 


(A, B) = tr(B' A) 
a. Find the distance between the vectors A and B. 


b. Find the cosine of the angle between the 
vectors A and B. 


1 2 PA | 
aa ae sae | 
3 1 0 2 
aaah a a] 
1 0 -2 
23. A=| -3 1 1 
EE 
3 -1 -1 
B=] 3 2 
zf ea 


342 


24, 


25. 


26. 


27. 


28. 


29. 


Chapter 6 Inner Product Spaces 


2 1 2 
A=}3 1 0 
3 2 1 
0 0 1 
B=|3 3 2 
1 0 2 
Describe the set of all vectors in R? that are 


orthogonal to | : | 


Describe the set of all vectors in R? that are 


orthogonal to | 4 f 


Describe the set of all vectors in R? that are 
2 

—3 
1 


orthogonal to 


Describe the set of all vectors in R? that are 


1 
orthogonal to | 1 
0 


For f and g in C(0, 1] define the inner product 


b 
4 1 


e= | Fd 

. Find (x°, x°). 

. Find (e*,e~*). 

. Find || 1 || and || x ||. 

. Find the angle between f(x) = 1 and 
g(x) =x. 

e. Find the distance between f(x) = 1 and 
g(x) =x. 


30. Let A be a fixed 2 × 2 matrix, and define a function on $\mathbb{R}^2 \times \mathbb{R}^2$ by

$$\langle \mathbf{u}, \mathbf{v} \rangle = \mathbf{u}^{t} A \mathbf{v}$$

a. Verify that if A = I, then the function defines an inner product.

b. Show that if $A = \begin{bmatrix} 2 & -1 \\ -1 & 2 \end{bmatrix}$, then the function defines an inner product.

c. Show that if $A = \begin{bmatrix} 3 & -2 \\ 2 & 0 \end{bmatrix}$, then the function does not define an inner product.

31. Define an inner product on $C^{(0)}[-a, a]$ by

$$\langle f, g \rangle = \int_{-a}^{a} f(x)g(x)\,dx$$

Show that if f is an even function and g is an odd function, then f and g are orthogonal.

32. Define an inner product on $C^{(0)}[-\pi, \pi]$ by

$$\langle f, g \rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx$$

Show that {1, cos x, sin x, cos 2x, sin 2x, ...} is an orthogonal set. (See Exercise 31.)

33. In an inner product space, show that if the set $\{\mathbf{u}_1, \mathbf{u}_2\}$ is orthogonal, then for scalars $c_1$ and $c_2$ the set $\{c_1\mathbf{u}_1, c_2\mathbf{u}_2\}$ is also orthogonal.

34. Show that if $\langle \mathbf{u}, \mathbf{v} \rangle$ and $\langle\langle \mathbf{u}, \mathbf{v} \rangle\rangle$ are two different inner products on V, then their sum

$$\langle\langle\langle \mathbf{u}, \mathbf{v} \rangle\rangle\rangle = \langle \mathbf{u}, \mathbf{v} \rangle + \langle\langle \mathbf{u}, \mathbf{v} \rangle\rangle$$

defines another inner product.


6.3 Orthonormal Bases


In Theorem 6 of Sec. 6.2 we saw that if $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ is an ordered orthonormal basis of an inner product space V, then the coordinates of any vector v in V are given by an explicit formula using the inner product on the space. In particular, these coordinates relative to B are given by $c_i = \langle \mathbf{v}, \mathbf{v}_i \rangle$ for i = 1, 2, ..., n. For this reason, an orthonormal basis for an inner product space is desirable. As we have already seen, the set of coordinate vectors $S = \{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ is an orthonormal basis for $\mathbb{R}^n$. In this section we develop a method for constructing an orthonormal basis for any finite dimensional inner product space.

[Figure 2: the vectors $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ and $\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}$]


Orthogonal Projections 


Of course, most of the bases we encounter are not orthonormal, or even orthogonal. 
We can, however, in a finite dimensional inner product space, transform any basis to an 
orthonormal basis. The method, called the Gram-Schmidt process, involves projections 
of vectors onto other vectors. 

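To motivate this topic, let u and v be nonzero vectors in $\mathbb{R}^2$, as shown in Fig. 1(a).

[Figure 1: (a) the vectors u and v with the angle θ between them; (b) the vector w, the orthogonal projection of u onto v]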


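Our aim is to find a vector w that results from an orthogonal projection of u onto v, as shown in Fig. 1(b). To do this, recall from trigonometry that

$$\cos\theta = \frac{\|\mathbf{w}\|}{\|\mathbf{u}\|} \quad\text{so that}\quad \|\mathbf{w}\| = \|\mathbf{u}\|\cos\theta$$

Moreover, using the expression for cos θ established at the beginning of Sec. 6.1, we have

$$\|\mathbf{w}\| = \|\mathbf{u}\|\cos\theta = \|\mathbf{u}\|\,\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{u}\|\,\|\mathbf{v}\|} = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{v}\|}$$

This quantity is called the scalar projection of u onto v. Now, to find w, we take the product of the scalar projection with a unit vector in the direction of v, so that

$$\mathbf{w} = \left(\frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{v}\|}\right)\frac{\mathbf{v}}{\|\mathbf{v}\|} = \frac{\mathbf{u}\cdot\mathbf{v}}{\|\mathbf{v}\|^{2}}\,\mathbf{v}$$

Moreover, since $\|\mathbf{v}\|^{2} = \mathbf{v}\cdot\mathbf{v}$, the vector w can be written in the form

$$\mathbf{w} = \left(\frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\right)\mathbf{v}$$

This vector is called the orthogonal projection of u onto v and is denoted by $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$, so that

$$\operatorname{proj}_{\mathbf{v}}\mathbf{u} = \left(\frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\right)\mathbf{v}$$

Another useful vector, shown in Fig. 2, is the vector

$$\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}$$

from $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ to u. From the manner in which $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ is defined, the vector $\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}$ is orthogonal to $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$, as shown in Fig. 2. To verify algebraically that $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ and $\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}$ are orthogonal, we show that the dot product of these two vectors is zero. That is,

$$\begin{aligned}
(\operatorname{proj}_{\mathbf{v}}\mathbf{u})\cdot(\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u})
&= \left(\frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\,\mathbf{v}\right)\cdot\left(\mathbf{u} - \frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\,\mathbf{v}\right) \\
&= \frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\,(\mathbf{v}\cdot\mathbf{u} - \mathbf{u}\cdot\mathbf{v})
= \frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\,(\mathbf{u}\cdot\mathbf{v} - \mathbf{u}\cdot\mathbf{v}) = 0
\end{aligned}$$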


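In Fig. 1, the angle θ shown is an acute angle. If θ is an obtuse angle, then $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ gives the orthogonal projection of u onto the negative of v, as shown in Fig. 3. If θ = 90°, then $\operatorname{proj}_{\mathbf{v}}\mathbf{u} = \mathbf{0}$.

[Figure 3: the projection w of u onto v when the angle θ is obtuse]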
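The projection formula can be tried out numerically. The following is a minimal sketch, not part of the text: the helper names `dot` and `proj` and the sample vectors are chosen here for illustration only, and exact rational arithmetic is used so that the orthogonality check is not affected by rounding.

```python
from fractions import Fraction


def dot(u, v):
    """Standard dot product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def proj(u, v):
    """Orthogonal projection of u onto a nonzero vector v: (u.v / v.v) v.

    Assumes integer entries so that Fraction(numerator, denominator) applies.
    """
    scale = Fraction(dot(u, v), dot(v, v))
    return [scale * b for b in v]


# Illustrative vectors (an assumption of this sketch).
u = [1, 3]
v = [1, 1]

w = proj(u, v)                         # projection of u onto v
r = [a - b for a, b in zip(u, w)]      # the vector u - proj_v u

print(w, r, dot(w, r))                 # the last value is the dot product of the two pieces

# Obtuse case: the projection points along the negative of v.
w_obtuse = proj([-1, 0], [1, 1])
```

Here `dot(w, r)` evaluates to zero, in agreement with the algebraic verification above, and in the obtuse case the scalar multiplying v is negative.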


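a. Find $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$.
b. Find $\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}$, and verify that $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ is orthogonal to $\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}$.


Solution a. From the formula given above, we have

$$\operatorname{proj}_{\mathbf{v}}\mathbf{u} = \left(\frac{\mathbf{u}\cdot\mathbf{v}}{\mathbf{v}\cdot\mathbf{v}}\right)\mathbf{v} = \begin{bmatrix} 2 \\ 2 \end{bmatrix}$$

b. Using the result of part (a), we have

$$\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u} = \begin{bmatrix} 1 \\ 3 \end{bmatrix} - \begin{bmatrix} 2 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ 1 \end{bmatrix}$$

To show that $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ is orthogonal to $\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}$, we compute the dot product. Here we have

$$\operatorname{proj}_{\mathbf{v}}\mathbf{u}\cdot(\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}) = \begin{bmatrix} 2 \\ 2 \end{bmatrix}\cdot\begin{bmatrix} -1 \\ 1 \end{bmatrix} = (2)(-1) + (2)(1) = 0$$

See Fig. 4.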


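[Figure 4]

Definition 1 provides an extension of this idea to general inner product spaces.


DEFINITION 1

Orthogonal Projection Let u and v be vectors in an inner product space. The orthogonal projection of u onto v, denoted by $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$, is defined by

$$\operatorname{proj}_{\mathbf{v}}\mathbf{u} = \frac{\langle \mathbf{u}, \mathbf{v} \rangle}{\langle \mathbf{v}, \mathbf{v} \rangle}\,\mathbf{v}$$

The vector $\mathbf{u} - \operatorname{proj}_{\mathbf{v}}\mathbf{u}$ is orthogonal to $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$.


EXAMPLE 2 Define an inner product on $P_3$ by

$$\langle p, q \rangle = \int_{0}^{1} p(x)q(x)\,dx$$

Let p(x) = x and q(x) = x².

a. Find $\operatorname{proj}_{q}p$.
b. Find $p - \operatorname{proj}_{q}p$ and verify that $\operatorname{proj}_{q}p$ and $p - \operatorname{proj}_{q}p$ are orthogonal.


Solution a. In this case

$$\langle p, q \rangle = \int_{0}^{1} x^{3}\,dx = \frac{1}{4} \quad\text{and}\quad \langle q, q \rangle = \int_{0}^{1} x^{4}\,dx = \frac{1}{5}$$

Now the projection of p onto q is given by

$$\operatorname{proj}_{q}p = \frac{\langle p, q \rangle}{\langle q, q \rangle}\,q = \frac{5}{4}x^{2}$$

b. From part (a), we have

$$p - \operatorname{proj}_{q}p = x - \frac{5}{4}x^{2}$$

To show that the vectors $\operatorname{proj}_{q}p$ and $p - \operatorname{proj}_{q}p$ are orthogonal, we show that the inner product is zero. Here we have

$$\int_{0}^{1} \frac{5}{4}x^{2}\left(x - \frac{5}{4}x^{2}\right)dx = \int_{0}^{1}\left(\frac{5}{4}x^{3} - \frac{25}{16}x^{4}\right)dx = \left[\frac{5}{16}x^{4} - \frac{5}{16}x^{5}\right]_{0}^{1} = 0$$

We now turn our attention to the construction of an orthonormal basis for an inner product space. The key to this construction is the projection of one vector onto another. As a preliminary step, let $V = \mathbb{R}^2$ and let $B = \{\mathbf{v}_1, \mathbf{v}_2\}$ be a basis, as shown in Fig. 5. Now, define the vectors $\mathbf{w}_1$ and $\mathbf{w}_2$ by

$$\mathbf{w}_1 = \mathbf{v}_1 \quad\text{and}\quad \mathbf{w}_2 = \mathbf{v}_2 - \operatorname{proj}_{\mathbf{v}_1}\mathbf{v}_2$$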


Recall from Example 1 that $\mathbf{w}_2$ defined in this way is orthogonal to $\mathbf{w}_1$, as shown in Fig. 6. We normalize these vectors by dividing each by its length, so that

$$B' = \left\{ \frac{\mathbf{w}_1}{\|\mathbf{w}_1\|}, \frac{\mathbf{w}_2}{\|\mathbf{w}_2\|} \right\}$$

is an orthonormal basis for $\mathbb{R}^2$.

[Figure 5: $B = \{\mathbf{v}_1, \mathbf{v}_2\}$ is a basis for $\mathbb{R}^2$. Figure 6: $\mathbf{w}_2 = \mathbf{v}_2 - \operatorname{proj}_{\mathbf{v}_1}\mathbf{v}_2$]

To construct an orthonormal basis for $\mathbb{R}^n$, we first need to extend this idea to general inner product spaces.


THEOREM 7

Every finite dimensional inner product space has an orthogonal basis.


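Proof The proof is by induction on the dimension n of the inner product space. First, if n = 1, then any basis $\{\mathbf{v}_1\}$ is orthogonal. Now assume that every inner product space of dimension n has an orthogonal basis. Let V be an inner product space with dim(V) = n + 1, and suppose that $\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n, \mathbf{v}_{n+1}\}$ is a basis. Let $W = \operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$. Observe that dim(W) = n. By the inductive hypothesis, W has an orthogonal basis B. Let $B = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$. Notice that $B' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n, \mathbf{v}_{n+1}\}$ is another basis for V. By Theorem 5 of Sec. 6.2, it suffices to find a nonzero vector w that is orthogonal to each vector in B. (Here is where we extend the idea presented just prior to the theorem.) Let

$$\begin{aligned}
\mathbf{w} &= \mathbf{v}_{n+1} - \operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_{n+1} - \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_{n+1} - \cdots - \operatorname{proj}_{\mathbf{w}_n}\mathbf{v}_{n+1} \\
&= \mathbf{v}_{n+1} - \frac{\langle \mathbf{v}_{n+1}, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 - \frac{\langle \mathbf{v}_{n+1}, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2 - \cdots - \frac{\langle \mathbf{v}_{n+1}, \mathbf{w}_n \rangle}{\langle \mathbf{w}_n, \mathbf{w}_n \rangle}\mathbf{w}_n
\end{aligned}$$

Observe that $\mathbf{w} \neq \mathbf{0}$, since if $\mathbf{w} = \mathbf{0}$, then B′ would be linearly dependent and therefore not a basis for V. To complete the proof, we must show that w is orthogonal to each vector in B. To see this, let $\mathbf{w}_i$ be a vector in B. Then

$$\begin{aligned}
\langle \mathbf{w}, \mathbf{w}_i \rangle &= \langle \mathbf{v}_{n+1} - \operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_{n+1} - \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_{n+1} - \cdots - \operatorname{proj}_{\mathbf{w}_n}\mathbf{v}_{n+1},\ \mathbf{w}_i \rangle \\
&= \langle \mathbf{v}_{n+1}, \mathbf{w}_i \rangle - \frac{\langle \mathbf{v}_{n+1}, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\langle \mathbf{w}_1, \mathbf{w}_i \rangle - \frac{\langle \mathbf{v}_{n+1}, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\langle \mathbf{w}_2, \mathbf{w}_i \rangle \\
&\qquad - \cdots - \frac{\langle \mathbf{v}_{n+1}, \mathbf{w}_i \rangle}{\langle \mathbf{w}_i, \mathbf{w}_i \rangle}\langle \mathbf{w}_i, \mathbf{w}_i \rangle - \cdots - \frac{\langle \mathbf{v}_{n+1}, \mathbf{w}_n \rangle}{\langle \mathbf{w}_n, \mathbf{w}_n \rangle}\langle \mathbf{w}_n, \mathbf{w}_i \rangle
\end{aligned}$$

Now, as the vectors in $B = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ are mutually orthogonal, the previous equation reduces to

$$\begin{aligned}
\langle \mathbf{w}, \mathbf{w}_i \rangle &= \langle \mathbf{v}_{n+1}, \mathbf{w}_i \rangle - 0 - 0 - \cdots - \frac{\langle \mathbf{v}_{n+1}, \mathbf{w}_i \rangle}{\langle \mathbf{w}_i, \mathbf{w}_i \rangle}\langle \mathbf{w}_i, \mathbf{w}_i \rangle - \cdots - 0 \\
&= \langle \mathbf{v}_{n+1}, \mathbf{w}_i \rangle - \langle \mathbf{v}_{n+1}, \mathbf{w}_i \rangle = 0
\end{aligned}$$

Therefore $B'' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n, \mathbf{w}\}$ is an orthogonal set of n + 1 nonzero vectors in V. That B″ is a basis for V is due to Corollary 1 of Sec. 6.2.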


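From Theorem 7, we also know that every finite dimensional inner product space has an orthonormal basis. That is, if $B = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ is an orthogonal basis, then dividing each vector by its length gives the orthonormal basis

$$B' = \left\{ \frac{\mathbf{w}_1}{\|\mathbf{w}_1\|}, \frac{\mathbf{w}_2}{\|\mathbf{w}_2\|}, \ldots, \frac{\mathbf{w}_n}{\|\mathbf{w}_n\|} \right\}$$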


Gram-Schmidt Process 


Theorem 7 guarantees the existence of an orthogonal basis in a finite dimensional inner 
product space. The proof of Theorem 7 also provides a procedure for constructing 
an orthogonal basis from any basis of the vector space. The algorithm, called the 
Gram-Schmidt process, is summarized here. 


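1. Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be any basis for the inner product space V.
2. Use B to define a set of n vectors as follows:

$$\begin{aligned}
\mathbf{w}_1 &= \mathbf{v}_1 \\
\mathbf{w}_2 &= \mathbf{v}_2 - \operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_2 = \mathbf{v}_2 - \frac{\langle \mathbf{v}_2, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 \\
\mathbf{w}_3 &= \mathbf{v}_3 - \operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_3 - \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_3 = \mathbf{v}_3 - \frac{\langle \mathbf{v}_3, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 - \frac{\langle \mathbf{v}_3, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2 \\
&\ \,\vdots \\
\mathbf{w}_n &= \mathbf{v}_n - \operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_n - \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_n - \cdots - \operatorname{proj}_{\mathbf{w}_{n-1}}\mathbf{v}_n \\
&= \mathbf{v}_n - \frac{\langle \mathbf{v}_n, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 - \frac{\langle \mathbf{v}_n, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2 - \cdots - \frac{\langle \mathbf{v}_n, \mathbf{w}_{n-1} \rangle}{\langle \mathbf{w}_{n-1}, \mathbf{w}_{n-1} \rangle}\mathbf{w}_{n-1}
\end{aligned}$$

3. The set $B' = \{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ is an orthogonal basis for V.
4. Dividing each of the vectors in B′ by its length gives an orthonormal basis for the vector space V

$$B'' = \left\{ \frac{\mathbf{w}_1}{\|\mathbf{w}_1\|}, \frac{\mathbf{w}_2}{\|\mathbf{w}_2\|}, \ldots, \frac{\mathbf{w}_n}{\|\mathbf{w}_n\|} \right\}$$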
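The four steps above can be sketched in a few lines of code. This is an illustrative sketch under stated assumptions, not the text's own implementation: the function names `gram_schmidt` and `normalize`, the default dot-product inner product, and the sample basis are all choices made here. Steps 1 to 3 are carried out in exact rational arithmetic; only the normalization in step 4 uses floating point, because the lengths are generally irrational.

```python
from fractions import Fraction
from math import sqrt


def dot(u, v):
    """Standard dot product, used as the default inner product."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(basis, inner=dot):
    """Steps 1-3: turn a basis v_1..v_n into an orthogonal basis w_1..w_n.

    Each w_k is v_k minus its projections onto w_1, ..., w_{k-1}.
    The input is assumed to be linearly independent.
    """
    ws = []
    for v in basis:
        w = [Fraction(x) for x in v]
        for prev in ws:
            c = Fraction(inner(v, prev)) / inner(prev, prev)
            w = [wi - c * pi for wi, pi in zip(w, prev)]
        ws.append(w)
    return ws


def normalize(w, inner=dot):
    """Step 4: divide a nonzero vector by its length (floating point)."""
    length = sqrt(inner(w, w))
    return [float(x) / length for x in w]


# Sample basis for R^3 (chosen for this sketch).
B = [[1, 1, 1], [-1, 1, 0], [-1, 0, 1]]
orthogonal = gram_schmidt(B)
orthonormal = [normalize(w) for w in orthogonal]
```

Passing a different `inner` function lets the same routine run in any inner product space whose vectors are stored as coordinate lists.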


A Geometric Interpretation of the Gram-Schmidt Process 


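The orthogonal projection $\operatorname{proj}_{\mathbf{v}}\mathbf{u}$ of the vector u onto the vector v, in $\mathbb{R}^n$, is the projection of u onto the one-dimensional subspace W = span{v}. See Figs. 2 and 3. As seen above, to construct an orthogonal basis from the basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\}$, the first step in the Gram-Schmidt process is to let $\mathbf{w}_1 = \mathbf{v}_1$, and then perform an orthogonal projection of $\mathbf{v}_2$ onto span{$\mathbf{v}_1$}. As a result, $\mathbf{w}_1$ is orthogonal to $\mathbf{w}_2 = \mathbf{v}_2 - \operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_2$. Our aim in the next step is to find a vector $\mathbf{w}_3$ that is orthogonal to the two-dimensional subspace span{$\mathbf{w}_1, \mathbf{w}_2$}. This is accomplished by projecting $\mathbf{v}_3$ separately onto the one-dimensional subspaces span{$\mathbf{w}_1$} and span{$\mathbf{w}_2$}, as shown in Fig. 7. The orthogonal projections are

$$\operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_3 = \frac{\langle \mathbf{v}_3, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 \quad\text{and}\quad \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_3 = \frac{\langle \mathbf{v}_3, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2$$

Hence, the orthogonal projection of $\mathbf{v}_3$ onto span{$\mathbf{w}_1, \mathbf{w}_2$} is the sum of the projections

$$\operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_3 + \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_3$$

also shown in Fig. 7. Finally, the required vector is

$$\mathbf{w}_3 = \mathbf{v}_3 - (\operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_3 + \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_3) = \mathbf{v}_3 - \operatorname{proj}_{\mathbf{w}_1}\mathbf{v}_3 - \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_3$$

which is orthogonal to both $\mathbf{w}_1$ and $\mathbf{w}_2$, as shown in Fig. 7.

In general, when dim(W) = n > 1, the Gram-Schmidt process describes projecting the vector $\mathbf{v}_{n+1}$ onto the n one-dimensional subspaces span{$\mathbf{w}_1$}, span{$\mathbf{w}_2$}, ..., span{$\mathbf{w}_n$}. Then the vector $\mathbf{w}_{n+1}$ that is orthogonal to each of the vectors $\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n$ is obtained by subtracting each projection from the vector $\mathbf{v}_{n+1}$.

[Figure 7: the projections of $\mathbf{v}_3$ onto span{$\mathbf{w}_1$} and span{$\mathbf{w}_2$}, their sum, and the vector $\mathbf{w}_3$]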


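EXAMPLE 3 Let B be the basis for $\mathbb{R}^3$ given by

$$B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3\} = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right\}$$

Apply the Gram-Schmidt process to B to find an orthonormal basis for $\mathbb{R}^3$.


Solution In this case the inner product on $\mathbb{R}^3$ is the dot product. Notice that $\mathbf{v}_1 \cdot \mathbf{v}_2 = 0$, so that the vectors $\mathbf{v}_1$ and $\mathbf{v}_2$ are already orthogonal. Applying the Gram-Schmidt process results in $\mathbf{w}_1 = \mathbf{v}_1$ and $\mathbf{w}_2 = \mathbf{v}_2$. Following the steps outlined above, we have

$$\mathbf{w}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{w}_2 = \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$$

Next note that $\mathbf{v}_1$ and $\mathbf{v}_3$ are also orthogonal, so that in this case only one projection is required. That is,

$$\mathbf{w}_3 = \mathbf{v}_3 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 - \frac{\mathbf{v}_3 \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2 = \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} - \frac{1}{2}\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{2} \\ -\frac{1}{2} \\ 1 \end{bmatrix}$$

Then

$$B' = \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{w}_3\} = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -\frac{1}{2} \\ -\frac{1}{2} \\ 1 \end{bmatrix} \right\}$$

is an orthogonal basis for $\mathbb{R}^3$. See Fig. 8. An orthonormal basis is then given by

$$B'' = \left\{ \frac{\mathbf{w}_1}{\|\mathbf{w}_1\|}, \frac{\mathbf{w}_2}{\|\mathbf{w}_2\|}, \frac{\mathbf{w}_3}{\|\mathbf{w}_3\|} \right\} = \left\{ \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \frac{1}{\sqrt{2}}\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \frac{1}{\sqrt{6}}\begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix} \right\}$$

[Figure 8: the vectors $\mathbf{w}_1$, $\mathbf{w}_2$, and $\mathbf{w}_3 = \mathbf{v}_3 - \operatorname{proj}_{\mathbf{w}_2}\mathbf{v}_3$]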


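Example 4 illustrates the use of the Gram-Schmidt process on a space of polynomials.


EXAMPLE 4 Define an inner product on $P_3$ by

$$\langle p, q \rangle = \int_{-1}^{1} p(x)q(x)\,dx$$

Use the standard basis $B = \{\mathbf{v}_1, \mathbf{v}_2, \mathbf{v}_3, \mathbf{v}_4\} = \{1, x, x^2, x^3\}$ to construct an orthogonal basis for $P_3$.


Solution First note that B is not orthogonal, since

$$\langle x, x^{3} \rangle = \int_{-1}^{1} x^{4}\,dx = \frac{2}{5}$$

We can simplify some of the work by noting that since the interval [-1, 1] is a symmetric interval,

$$\text{when } p \text{ is an odd function, then } \int_{-1}^{1} p(x)\,dx = 0$$

$$\text{when } p \text{ is an even function, then } \int_{-1}^{1} p(x)\,dx = 2\int_{0}^{1} p(x)\,dx$$

Now, since f(x) = x, g(x) = x³, and h(x) = x⁵ are all odd functions,

$$\langle \mathbf{v}_1, \mathbf{v}_2 \rangle = \int_{-1}^{1} x\,dx = 0 \qquad \langle \mathbf{v}_2, \mathbf{v}_3 \rangle = \int_{-1}^{1} x^{3}\,dx = 0$$

$$\langle \mathbf{v}_4, \mathbf{v}_1 \rangle = \int_{-1}^{1} x^{3}\,dx = 0 \qquad \langle \mathbf{v}_4, \mathbf{v}_3 \rangle = \int_{-1}^{1} x^{5}\,dx = 0$$

Since $\mathbf{v}_1$ and $\mathbf{v}_2$ are orthogonal, proceeding with the Gram-Schmidt process, we have

$$\mathbf{w}_1 = \mathbf{v}_1 \quad\text{and}\quad \mathbf{w}_2 = \mathbf{v}_2$$

Next, to find $\mathbf{w}_3$, the required computation is

$$\mathbf{w}_3 = \mathbf{v}_3 - \frac{\langle \mathbf{v}_3, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 - \frac{\langle \mathbf{v}_3, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2 = \mathbf{v}_3 - \frac{\langle \mathbf{v}_3, \mathbf{v}_1 \rangle}{\langle \mathbf{v}_1, \mathbf{v}_1 \rangle}\mathbf{v}_1 - \frac{\langle \mathbf{v}_2, \mathbf{v}_3 \rangle}{\langle \mathbf{v}_2, \mathbf{v}_2 \rangle}\mathbf{v}_2$$

But we have already noted above that $0 = \langle \mathbf{v}_2, \mathbf{v}_3 \rangle$, and since

$$\langle \mathbf{v}_3, \mathbf{v}_1 \rangle = \int_{-1}^{1} x^{2}\,dx = 2\int_{0}^{1} x^{2}\,dx = \frac{2}{3} \quad\text{and}\quad \langle \mathbf{v}_1, \mathbf{v}_1 \rangle = \int_{-1}^{1} dx = 2$$

then

$$\mathbf{w}_3 = x^{2} - \frac{1}{3}$$

To find $\mathbf{w}_4$, we first note that since $\langle \mathbf{v}_4, \mathbf{v}_1 \rangle = 0$, $\mathbf{w}_1 = \mathbf{v}_1$, and $\mathbf{w}_2 = \mathbf{v}_2$, then

$$\begin{aligned}
\mathbf{w}_4 &= \mathbf{v}_4 - \frac{\langle \mathbf{v}_4, \mathbf{w}_1 \rangle}{\langle \mathbf{w}_1, \mathbf{w}_1 \rangle}\mathbf{w}_1 - \frac{\langle \mathbf{v}_4, \mathbf{w}_2 \rangle}{\langle \mathbf{w}_2, \mathbf{w}_2 \rangle}\mathbf{w}_2 - \frac{\langle \mathbf{v}_4, \mathbf{w}_3 \rangle}{\langle \mathbf{w}_3, \mathbf{w}_3 \rangle}\mathbf{w}_3 \\
&= \mathbf{v}_4 - \frac{\langle \mathbf{v}_4, \mathbf{v}_2 \rangle}{\langle \mathbf{v}_2, \mathbf{v}_2 \rangle}\mathbf{v}_2 - \frac{\langle \mathbf{v}_4, \mathbf{w}_3 \rangle}{\langle \mathbf{w}_3, \mathbf{w}_3 \rangle}\mathbf{w}_3
\end{aligned}$$

Next, observe that $p(x) = x^{5} - \frac{1}{3}x^{3}$ is an odd function. Hence,

$$\langle \mathbf{v}_4, \mathbf{w}_3 \rangle = \int_{-1}^{1}\left(x^{5} - \frac{1}{3}x^{3}\right)dx = 0$$

Consequently,

$$\mathbf{w}_4 = x^{3} - \frac{\langle \mathbf{v}_4, \mathbf{v}_2 \rangle}{\langle \mathbf{v}_2, \mathbf{v}_2 \rangle}\,x = x^{3} - \frac{2/5}{2/3}\,x = x^{3} - \frac{3}{5}x$$

An orthogonal basis for $P_3$ is therefore given by

$$B' = \left\{ 1,\ x,\ x^{2} - \frac{1}{3},\ x^{3} - \frac{3}{5}x \right\}$$

By normalizing each of these vectors, we obtain the orthonormal basis

$$B'' = \left\{ \frac{\sqrt{2}}{2},\ \frac{\sqrt{6}}{2}x,\ \frac{3\sqrt{10}}{4}\left(x^{2} - \frac{1}{3}\right),\ \frac{5\sqrt{14}}{4}\left(x^{3} - \frac{3}{5}x\right) \right\}$$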
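The computation in this example can be checked mechanically. The sketch below is an illustration only: it assumes polynomials are stored as coefficient lists (entry k is the coefficient of x^k), and the names `poly_inner` and `gram_schmidt_poly` are hypothetical helpers introduced here. It uses the fact that the integral of x^n over [-1, 1] is 0 for odd n and 2/(n + 1) for even n.

```python
from fractions import Fraction


def poly_inner(p, q):
    """<p, q> = integral of p(x) q(x) over [-1, 1].

    p and q are coefficient lists: p[k] is the coefficient of x**k.
    """
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            n = i + j
            if n % 2 == 0:  # odd powers integrate to zero on [-1, 1]
                total += Fraction(a) * Fraction(b) * Fraction(2, n + 1)
    return total


def gram_schmidt_poly(basis):
    """Gram-Schmidt process with respect to poly_inner (no normalization)."""
    ws = []
    for v in basis:
        w = [Fraction(c) for c in v]
        for prev in ws:
            c = poly_inner(v, prev) / poly_inner(prev, prev)
            w = [wi - c * pi for wi, pi in zip(w, prev)]
        ws.append(w)
    return ws


# The standard basis {1, x, x^2, x^3} of P3 as coefficient lists.
B = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
W = gram_schmidt_poly(B)

# Squared lengths <w, w>, whose square roots give the normalizing factors.
norms_squared = [poly_inner(w, w) for w in W]
```

The third and fourth entries of `W` hold the coefficients of x² - 1/3 and x³ - (3/5)x, and `norms_squared` comes out as 2, 2/3, 8/45, and 8/175, consistent with the normalizing factors √2/2, √6/2, 3√10/4, and 5√14/4.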


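EXAMPLE 5 Let U be the subspace of $\mathbb{R}^4$ with basis

$$B = \{\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3\} = \left\{ \begin{bmatrix} -1 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} \right\}$$

where the inner product is the dot product. Find an orthonormal basis for U.


Solution Following the Gram-Schmidt process, we let $\mathbf{w}_1 = \mathbf{u}_1$. Next we have

$$\mathbf{w}_2 = \mathbf{u}_2 - \frac{\mathbf{u}_2 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 = \begin{bmatrix} -1 \\ 0 \\ 1 \\ 0 \end{bmatrix} - \frac{2}{3}\begin{bmatrix} -1 \\ 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} -\frac{1}{3} \\ -\frac{2}{3} \\ \frac{1}{3} \\ 0 \end{bmatrix}$$

To facilitate the computations, we replace $\mathbf{w}_2$ with

$$\mathbf{w}_2 = \begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \end{bmatrix}$$

To justify this substitution, note that $\mathbf{w}_1 \cdot \mathbf{w}_2 = 0$; that is, multiplying $\mathbf{w}_2$ by a scalar does not change the fact that it is orthogonal to $\mathbf{w}_1$. To find $\mathbf{w}_3$, we use the computation

$$\mathbf{w}_3 = \mathbf{u}_3 - \frac{\mathbf{u}_3 \cdot \mathbf{w}_1}{\mathbf{w}_1 \cdot \mathbf{w}_1}\mathbf{w}_1 - \frac{\mathbf{u}_3 \cdot \mathbf{w}_2}{\mathbf{w}_2 \cdot \mathbf{w}_2}\mathbf{w}_2 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} + \frac{1}{3}\begin{bmatrix} -1 \\ 1 \\ 1 \\ 0 \end{bmatrix} - \frac{1}{6}\begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \end{bmatrix} = \frac{1}{2}\begin{bmatrix} 1 \\ 0 \\ 1 \\ 2 \end{bmatrix}$$

As before we replace $\mathbf{w}_3$ with

$$\mathbf{w}_3 = \begin{bmatrix} 1 \\ 0 \\ 1 \\ 2 \end{bmatrix}$$

An orthogonal basis for U is then given by

$$B' = \left\{ \begin{bmatrix} -1 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \\ 2 \end{bmatrix} \right\}$$

Normalizing each of the vectors of B′ produces the orthonormal basis

$$B'' = \left\{ \frac{1}{\sqrt{3}}\begin{bmatrix} -1 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \frac{1}{\sqrt{6}}\begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \end{bmatrix}, \frac{1}{\sqrt{6}}\begin{bmatrix} 1 \\ 0 \\ 1 \\ 2 \end{bmatrix} \right\}$$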


Fact Summary 


1. Every finite dimensional inner product space has an orthonormal basis. 


2. The Gram-Schmidt process is an algorithm to construct an orthonormal 
basis from any basis of the vector space. 


Exercise Set 6.3 


In Exercises 1—8, use the standard inner product on =j =y 
n 1. u= v= 
R”. 2 1 
a. Find proj, u. 
b. Find the vector u — proj,u and verify this 2. u= | 3 | v= | l | 
vector is orthogonal to v. = =e 


1 1 
se [a[i 
1 —2 
THRE 
—1 1 
5. u= 3 }v=] -l 
0 —1 
1 
6. u= 0 y == 
1 —1 
1 0 
7.u= | -1 |v=]| 0 
—1 1 
3 1 
8 u= 2 |v= 
0 —1 


In Exercises 9-12, use the inner product on $P_2$ defined by

$$\langle p, q \rangle = \int_{0}^{1} p(x)q(x)\,dx$$

a. Find $\operatorname{proj}_{q}p$.

b. Find the vector $p - \operatorname{proj}_{q}p$ and verify that this vector is orthogonal to q.

9. p(x) = x² - x + 1, q(x) = 3x - 1
10. p(x) = x² - x + 1, q(x) = 2x - 1
11. p(x) = 2x² + 1, q(x) = x² - 1
12. p(x) = -4x + 1, q(x) = x

In Exercises 13-16, use the standard inner product on $\mathbb{R}^n$. Use the basis B and the Gram-Schmidt process to find an orthonormal basis for $\mathbb{R}^n$.


gasd ANa 
a a={[ 3).[2]} 


1 0 0 
15. B= 0O},; —1 |,| -l 
1 1 —1 




16. B = Oj, 14, 


In Exercises 17 and 18, use the inner product on P2 
defined by 


1 
(p.q) = f p(x)q (x) dx 


Use the given basis B and the Gram-Schmidt process 
to find an orthonormal basis for P2. 


17. B = {x - 1, x + 2, x²}


18. B = {x² - x, x, 2x + 1}


In Exercises 19—22, use the standard inner product on 
R” to find an orthonormal basis for the subspace 
span(W). 


1 1 
19. W = 1 |,| —1 
1 —1 
0 -1 
20. W= 1 |,| —1 
1 1 


2. W= al Meee al ae 
1 - 1 

1 =] 0 

=2 3 =1 

22. W = a ae a 
0 =I =1 


In Exercises 23 and 24, use the inner product on P3 
defined by 


1 
(Pq) = p p(x)q (x) dx 


to find an orthonormal basis for the subspace 
span(W). 


23. W = {x,2x + 1} 


24. W = {1,x +2, x3 — 1} 




25. In R* with the standard inner product find an 
orthonormal basis for 


l 2 -1 
span d | ° -3 2 
1P- pPI s PI -3 
1 =i 1 


26. In R? with the standard inner product find an 
orthonormal basis for 


2 3 3 1 
span 0 |,| 1 7,} -l J,] 1 
1 1 
27. Let $\{\mathbf{u}_1, \mathbf{u}_2, \ldots, \mathbf{u}_n\}$ be an orthonormal basis for $\mathbb{R}^n$. Show that

$$\|\mathbf{v}\|^{2} = |\mathbf{v}\cdot\mathbf{u}_1|^{2} + \cdots + |\mathbf{v}\cdot\mathbf{u}_n|^{2}$$

for every vector v in $\mathbb{R}^n$.

28. Let A be an n × n matrix. Show that the following conditions are equivalent.

a. $A^{-1} = A^{t}$

b. The row vectors of A form an orthonormal basis for $\mathbb{R}^n$.

c. The column vectors of A form an orthonormal basis for $\mathbb{R}^n$.

29. Show that an n × n matrix A has orthonormal column vectors if and only if $A^{t}A = I$.

30. Let A be an m × n matrix, x a vector in $\mathbb{R}^m$, and y a vector in $\mathbb{R}^n$. Show that $\mathbf{x}\cdot(A\mathbf{y}) = A^{t}\mathbf{x}\cdot\mathbf{y}$.

31. Show that if A is an m × n matrix with orthonormal column vectors, then $\|A\mathbf{x}\| = \|\mathbf{x}\|$.

32. Show that if A is an m × n matrix with orthonormal column vectors and x and y are in $\mathbb{R}^n$, then $(A\mathbf{x})\cdot(A\mathbf{y}) = \mathbf{x}\cdot\mathbf{y}$.

33. Show that if A is an m × n matrix with orthonormal column vectors and x and y are in $\mathbb{R}^n$, then $(A\mathbf{x})\cdot(A\mathbf{y}) = 0$ if and only if $\mathbf{x}\cdot\mathbf{y} = 0$.


34. In Rf with the standard inner product show that 
1 


the set of all vectors orthogonal to both = 
1 


35. 


2 

and = is a subspace. Find a basis for the 
2 

subspace. 

Let S = {uy,..., Um} be a set of vectors in R”. 


Show that the set of all vectors orthogonal to 
every u; is a subspace of R”. 


In Exercises 36-41, a (real) n × n matrix A is called positive semidefinite if A is symmetric and $\mathbf{u}^{t}A\mathbf{u} \geq 0$ for every nonzero vector u in $\mathbb{R}^n$. If the inequality is strict, then A is positive definite.


36. 


37. 


38. 


39. 


40. 
41. 


42. 


Let A be a positive definite matrix. Show that the 
function (u, v) = u’ Av defines an inner product 
on R”. (Note that when A = J this function 
corresponds to the dot product.) 


Let A = : : | Show that A is positive 
definite. 
Show that if A is positive definite, then the 


diagonal entries are positive. 


Let A be an m x n matrix. Show that A'A is 
positive semidefinite. 


Show that a positive definite matrix is invertible. 


Show that the eigenvalues of a positive definite 
matrix are positive. 


Let 


a. Are the vectors vı = K and 
V= A orthogonal? 
—2 -l1 
Let A = | 2 4 | 


b. Find det(Aʻ A). 
c. Show that the area of the rectangle spanned by 


y= | i and v2 = | is /det(A‘A). 


d. Show that the area of the rectangle is | det(A)|. 


e. If vj and v2 are any two orthogonal vectors in 
IR’, show that the area of the rectangle spanned 


by the vectors is | det(A)|, where A is the 
matrix with row vectors vı and v2. 

. Let vı and vz be two vectors in R? that span a 
parallelogram, as shown in the figure. Show 
that the area of the parallelogram is | det(A)|, 
where A is the matrix with row vectors vı 

and v2. 

. If vı, V2, and v3 are mutually orthogonal 
vectors in R?, show that the volume of the box 
spanned by the three vectors is | det(A)|, where 
A is the matrix with row vectors v1, V2, and v3. 


6.4 Orthogonal Complements


Throughout this chapter we have seen the importance of orthogonal vectors and bases in inner product spaces. Recall that two vectors u and v in an inner product space V are orthogonal if and only if

$$\langle \mathbf{u}, \mathbf{v} \rangle = 0$$

A collection of vectors forms an orthogonal basis if the vectors are a basis and are pairwise orthogonal. In this section we extend the notion of orthogonality to subspaces of inner product spaces. As a first step, let v be a vector in an inner product space V and W a subspace of V. We say that v is orthogonal to W if and only if

$$\langle \mathbf{v}, \mathbf{w} \rangle = 0 \quad\text{for each vector } \mathbf{w} \in W$$

As an illustration, let W be the yz plane in the Euclidean space $\mathbb{R}^3$. Observe that W is closed under addition and scalar multiplication, so that by Theorem 3 of Sec. 3.2 it is a subspace. Using the dot product as the inner product on $\mathbb{R}^3$, the vector

$$\mathbf{e}_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$$

is orthogonal to W since

$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}\cdot\begin{bmatrix} 0 \\ y \\ z \end{bmatrix} = 0$$

for every y, z ∈ ℝ. Note that any scalar multiple of $\mathbf{e}_1$ is also orthogonal to W. Example 1 gives an illustration of how to find vectors orthogonal to a subspace.


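EXAMPLE 1 Let $V = \mathbb{R}^3$, with the dot product as the inner product, and let W be the subspace defined by

$$W = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} \right\}$$

Describe all vectors in $\mathbb{R}^3$ that are orthogonal to W.


Solution Let

$$\mathbf{w} = \begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix}$$

Thus, any vector in W has the form cw, for some real number c. Consequently, a vector

$$\mathbf{v} = \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

in $\mathbb{R}^3$ is orthogonal to W if and only if $\mathbf{v}\cdot\mathbf{w} = 0$. This last equation is equivalent to the equation

$$x - 2y + 3z = 0$$

whose solution set is given by

$$S = \{(2s - 3t, s, t) \mid s, t \in \mathbb{R}\}$$

Therefore, the set of vectors orthogonal to W is given by

$$S' = \left\{ \begin{bmatrix} 2s - 3t \\ s \\ t \end{bmatrix} \,\middle|\, s, t \in \mathbb{R} \right\} = \left\{ s\begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} + t\begin{bmatrix} -3 \\ 0 \\ 1 \end{bmatrix} \,\middle|\, s, t \in \mathbb{R} \right\}$$

Letting s = t = 1 gives the particular vector

$$\mathbf{v} = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}$$

which is orthogonal to W since

$$\mathbf{v}\cdot\mathbf{w} = \begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}\cdot\begin{bmatrix} 1 \\ -2 \\ 3 \end{bmatrix} = (-1)(1) + (1)(-2) + (1)(3) = 0$$

If the vectors in S′ are placed in standard position, then the solution set describes a plane in $\mathbb{R}^3$, as shown in Fig. 1. This is in support of our intuition, as the set of vectors orthogonal to a single vector in $\mathbb{R}^3$ should all lie in a plane perpendicular to that vector, which is called the normal vector.

[Figure 1: the plane of vectors orthogonal to W, plotted on x, y, and z axes from -5 to 5]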


The set of vectors found in Example 1, orthogonal to the subspace W, is called 
the orthogonal complement of W. The following definition generalizes this idea to 
inner product spaces. 


DEFINITION 1

Orthogonal Complement Let W be a subspace of an inner product space V. The orthogonal complement of W, denoted by $W^{\perp}$, is the set of all vectors in V that are orthogonal to W. That is,

$$W^{\perp} = \{\mathbf{v} \in V \mid \langle \mathbf{v}, \mathbf{w} \rangle = 0 \text{ for all } \mathbf{w} \in W\}$$


EXAMPLE 2 Let $V = P_3$ and define an inner product on V by

$$\langle p, q \rangle = \int_{0}^{1} p(x)q(x)\,dx$$

Find $W^{\perp}$ if W is the subspace of constant polynomials.


Solution Let f(x) = a + bx + cx² + dx³ be an arbitrary polynomial in $P_3$ and p(x) = k be an arbitrary constant polynomial. Then f is in $W^{\perp}$ if and only if

$$\langle f, p \rangle = \int_{0}^{1} k(a + bx + cx^{2} + dx^{3})\,dx = k\left(a + \frac{b}{2} + \frac{c}{3} + \frac{d}{4}\right) = 0$$

Since this equation must hold for all k ∈ ℝ,

$$W^{\perp} = \left\{ a + bx + cx^{2} + dx^{3} \,\middle|\, a + \frac{b}{2} + \frac{c}{3} + \frac{d}{4} = 0 \right\}$$

Notice in Examples 1 and 2 that the zero vector is an element of the orthogonal complement $W^{\perp}$. It can also be shown for these examples that $W^{\perp}$ is closed under vector space addition and scalar multiplication. This leads to Theorem 8.


THEOREM 8

Let W be a subspace of an inner product space V.

1. The orthogonal complement $W^{\perp}$ is a subspace of V.
2. The only vector in W and $W^{\perp}$ is the zero vector; that is, $W \cap W^{\perp} = \{\mathbf{0}\}$.


Proof (1) Let u and v be vectors in $W^{\perp}$, and w a vector in W, so that

$$\langle \mathbf{u}, \mathbf{w} \rangle = 0 \quad\text{and}\quad \langle \mathbf{v}, \mathbf{w} \rangle = 0$$

Now for any scalar c, we have

$$\begin{aligned}
\langle \mathbf{u} + c\mathbf{v}, \mathbf{w} \rangle &= \langle \mathbf{u}, \mathbf{w} \rangle + \langle c\mathbf{v}, \mathbf{w} \rangle \\
&= \langle \mathbf{u}, \mathbf{w} \rangle + c\langle \mathbf{v}, \mathbf{w} \rangle \\
&= 0 + 0 = 0
\end{aligned}$$

Thus, u + cv is in $W^{\perp}$, and therefore by Theorem 4 of Sec. 3.2, $W^{\perp}$ is a subspace of V.

(2) Let w be any vector in $W \cap W^{\perp}$. Then

$$\langle \mathbf{w}, \mathbf{w} \rangle = 0$$

and hence w = 0 (see Definition 1 of Sec. 6.2). Thus, $W \cap W^{\perp} = \{\mathbf{0}\}$.


To determine whether a vector v is in the orthogonal complement of a subspace, it suffices to show that v is orthogonal to each one of the vectors in a basis for the subspace.


PROPOSITION 5

Let W be a subspace of an inner product space V and $B = \{\mathbf{w}_1, \ldots, \mathbf{w}_m\}$ a basis for W. The vector v is in $W^{\perp}$ if and only if v is orthogonal to each vector in B.


Proof First suppose that v is orthogonal to each vector in B. Let w be a vector in W. Then there are scalars $c_1, \ldots, c_m$ such that

$$\mathbf{w} = c_1\mathbf{w}_1 + \cdots + c_m\mathbf{w}_m$$

To show that v is in $W^{\perp}$, take the inner product of both sides of the previous equation with v, so that

$$\langle \mathbf{v}, \mathbf{w} \rangle = c_1\langle \mathbf{v}, \mathbf{w}_1 \rangle + c_2\langle \mathbf{v}, \mathbf{w}_2 \rangle + \cdots + c_m\langle \mathbf{v}, \mathbf{w}_m \rangle$$

Since $\langle \mathbf{v}, \mathbf{w}_j \rangle = 0$ for all j = 1, 2, ..., m, we have $\langle \mathbf{v}, \mathbf{w} \rangle = 0$ and hence $\mathbf{v} \in W^{\perp}$. On the other hand, if $\mathbf{v} \in W^{\perp}$, then v is orthogonal to each vector in W. In particular, v is orthogonal to $\mathbf{w}_j$, for all j = 1, 2, ..., m.
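Proposition 5 reduces membership in the orthogonal complement to finitely many inner products, which is easy to express in code. The following is a small sketch under stated assumptions: the function name `in_orthogonal_complement` and the sample subspace of R^4 are illustrative choices, and the dot product is assumed as the inner product.

```python
def dot(u, v):
    """Standard dot product, used as the inner product in this sketch."""
    return sum(a * b for a, b in zip(u, v))


def in_orthogonal_complement(v, basis, inner=dot):
    """Return True when v is orthogonal to every vector in a basis for W.

    By Proposition 5 this is equivalent to v being in the orthogonal
    complement of W. Exact (integer or Fraction) entries are assumed, so
    that comparison with zero is meaningful.
    """
    return all(inner(v, w) == 0 for w in basis)


# Sample data: a basis for a two-dimensional subspace W of R^4.
W_basis = [[1, 0, -1, -1], [0, 1, -1, 1]]

print(in_orthogonal_complement([1, 1, 1, 0], W_basis))
print(in_orthogonal_complement([1, 0, 0, 0], W_basis))
```

With floating-point entries one would compare against a small tolerance instead of testing for exact zero.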


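EXAMPLE 3 Let $V = \mathbb{R}^4$ with the dot product as the inner product, and let

$$W = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ -1 \\ -1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix} \right\}$$

a. Find a basis for W.
b. Find a basis for $W^{\perp}$.
c. Find an orthonormal basis for $\mathbb{R}^4$.
d. Let

$$\mathbf{v}_0 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}$$

Show that $\mathbf{v}_0$ can be written as the sum of a vector from W and a vector from $W^{\perp}$.


Solution a. Let

$$\mathbf{w}_1 = \begin{bmatrix} 1 \\ 0 \\ -1 \\ -1 \end{bmatrix} \quad\text{and}\quad \mathbf{w}_2 = \begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix}$$

Notice that $\mathbf{w}_1$ and $\mathbf{w}_2$ are orthogonal and hence by Theorem 5 of Sec. 6.2 are linearly independent. Thus, $\{\mathbf{w}_1, \mathbf{w}_2\}$ is a basis for W.

b. Now by Proposition 5, the vector

$$\mathbf{v} = \begin{bmatrix} x \\ y \\ z \\ w \end{bmatrix}$$

is in $W^{\perp}$ if and only if $\mathbf{v}\cdot\mathbf{w}_1 = 0$ and $\mathbf{v}\cdot\mathbf{w}_2 = 0$. This requirement leads to the linear system

$$\begin{aligned} x - z - w &= 0 \\ y - z + w &= 0 \end{aligned}$$

The two-parameter solution set for this linear system is

$$S = \left\{ \begin{bmatrix} s + t \\ s - t \\ s \\ t \end{bmatrix} \,\middle|\, s, t \in \mathbb{R} \right\}$$

The solution to this system, in vector form, provides a description of the orthogonal complement of W and is given by

$$W^{\perp} = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 0 \\ 1 \end{bmatrix} \right\}$$

Let

$$\mathbf{v}_1 = \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} 1 \\ -1 \\ 0 \\ 1 \end{bmatrix}$$

Since $\mathbf{v}_1$ and $\mathbf{v}_2$ are orthogonal, by Theorem 5 of Sec. 6.2 they are linearly independent and hence a basis for $W^{\perp}$.

c. Let B be the set of vectors $B = \{\mathbf{w}_1, \mathbf{w}_2, \mathbf{v}_1, \mathbf{v}_2\}$. Since B is an orthogonal set of four vectors in $\mathbb{R}^4$, then by Corollary 1 of Sec. 6.2, B is a basis for $\mathbb{R}^4$. Dividing each of these vectors by its length, we obtain the (ordered) orthonormal basis for $\mathbb{R}^4$ given by

$$B' = \{\mathbf{b}_1, \mathbf{b}_2, \mathbf{b}_3, \mathbf{b}_4\} = \left\{ \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ 0 \\ -1 \\ -1 \end{bmatrix}, \frac{1}{\sqrt{3}}\begin{bmatrix} 0 \\ 1 \\ -1 \\ 1 \end{bmatrix}, \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \frac{1}{\sqrt{3}}\begin{bmatrix} 1 \\ -1 \\ 0 \\ 1 \end{bmatrix} \right\}$$

d. By Theorem 6 of Sec. 6.2 the coordinates of $\mathbf{v}_0$ relative to B′ are given by $c_i = \mathbf{v}_0\cdot\mathbf{b}_i$, for 1 ≤ i ≤ 4. So

$$c_1 = \frac{1}{\sqrt{3}} \qquad c_2 = 0 \qquad c_3 = \frac{1}{\sqrt{3}} \qquad c_4 = \frac{1}{\sqrt{3}}$$

Now, observe that the first two vectors of B′ are an orthonormal basis for W while the second two vectors are an orthonormal basis for $W^{\perp}$. Let w be the vector in W given by

$$\mathbf{w} = c_1\mathbf{b}_1 + c_2\mathbf{b}_2 = \frac{1}{3}\begin{bmatrix} 1 \\ 0 \\ -1 \\ -1 \end{bmatrix}$$

and u be the vector in $W^{\perp}$ given by

$$\mathbf{u} = c_3\mathbf{b}_3 + c_4\mathbf{b}_4 = \frac{1}{3}\begin{bmatrix} 2 \\ 0 \\ 1 \\ 1 \end{bmatrix}$$

Then

$$\mathbf{w} + \mathbf{u} = \frac{1}{3}\begin{bmatrix} 1 \\ 0 \\ -1 \\ -1 \end{bmatrix} + \frac{1}{3}\begin{bmatrix} 2 \\ 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix} = \mathbf{v}_0$$

The vector w in Example 3 is called the orthogonal projection of $\mathbf{v}_0$ onto the subspace W, and the vector u is called the component of $\mathbf{v}_0$ orthogonal to W. The situation, in general, is the content of Definition 2 and Theorem 9.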


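DEFINITION 2

Direct Sum Let $W_1$ and $W_2$ be subspaces of a vector space V. If each vector in V can be written uniquely as the sum of a vector from $W_1$ and a vector from $W_2$, then V is called the direct sum of $W_1$ and $W_2$. In this case we write $V = W_1 \oplus W_2$.


PROPOSITION 6

Let $W_1$ and $W_2$ be subspaces of a vector space V with $V = W_1 \oplus W_2$. Then $W_1 \cap W_2 = \{\mathbf{0}\}$.


Proof Let $\mathbf{v} \in W_1 \cap W_2$. Then

$$\mathbf{v} = \mathbf{w}_1 + \mathbf{0} \quad\text{and}\quad \mathbf{v} = \mathbf{0} + \mathbf{w}_2$$

with $\mathbf{w}_1 \in W_1$ and $\mathbf{w}_2 \in W_2$. Hence, by the uniqueness of direct sum representations, we have $\mathbf{w}_1 = \mathbf{w}_2 = \mathbf{0}$, and therefore v = 0.


THEOREM 9

Projection Theorem If W is a finite dimensional subspace of an inner product space V, then

$$V = W \oplus W^{\perp}$$


Proof The proof of this theorem has two parts. First we must show that for any vector $\mathbf{v} \in V$ there exist vectors $\mathbf{w} \in W$ and $\mathbf{u} \in W^{\perp}$ such that w + u = v. Then we must show that this representation is unique.

For the first part, let $B = \{\mathbf{w}_1, \ldots, \mathbf{w}_n\}$ be a basis for W. By Theorem 7 of Sec. 6.3, we can take B to be an orthonormal basis for W. Now, let v be a vector in V, and let the vectors w and u be defined by

$$\mathbf{w} = \langle \mathbf{v}, \mathbf{w}_1 \rangle\mathbf{w}_1 + \langle \mathbf{v}, \mathbf{w}_2 \rangle\mathbf{w}_2 + \cdots + \langle \mathbf{v}, \mathbf{w}_n \rangle\mathbf{w}_n \quad\text{and}\quad \mathbf{u} = \mathbf{v} - \mathbf{w}$$

Since w is a linear combination of the vectors in B, then $\mathbf{w} \in W$. To show that u is in $W^{\perp}$, we show that $\langle \mathbf{u}, \mathbf{w}_i \rangle = 0$ for each i = 1, 2, ..., n and invoke Proposition 5. To this end, let $\mathbf{w}_i$ be a vector in B. Then

$$\begin{aligned}
\langle \mathbf{u}, \mathbf{w}_i \rangle &= \langle \mathbf{v} - \mathbf{w}, \mathbf{w}_i \rangle \\
&= \langle \mathbf{v}, \mathbf{w}_i \rangle - \langle \mathbf{w}, \mathbf{w}_i \rangle \\
&= \langle \mathbf{v}, \mathbf{w}_i \rangle - \sum_{j=1}^{n} \langle \mathbf{v}, \mathbf{w}_j \rangle\langle \mathbf{w}_j, \mathbf{w}_i \rangle
\end{aligned}$$

Since B is an orthonormal basis,

$$\langle \mathbf{w}_i, \mathbf{w}_i \rangle = 1 \quad\text{and}\quad \langle \mathbf{w}_j, \mathbf{w}_i \rangle = 0 \text{ for } i \neq j$$

Hence,

$$\langle \mathbf{u}, \mathbf{w}_i \rangle = \langle \mathbf{v}, \mathbf{w}_i \rangle - \langle \mathbf{v}, \mathbf{w}_i \rangle\langle \mathbf{w}_i, \mathbf{w}_i \rangle = 0$$

Since this holds for each i = 1, ..., n, then by Proposition 5 the vector $\mathbf{u} \in W^{\perp}$ as claimed.

For the second part of the proof, let

$$\mathbf{v} = \mathbf{w} + \mathbf{u} \quad\text{and}\quad \mathbf{v} = \mathbf{w}' + \mathbf{u}'$$

with w and w′ in W and u and u′ in $W^{\perp}$. Subtracting the previous equations gives

$$(\mathbf{w} - \mathbf{w}') + (\mathbf{u} - \mathbf{u}') = \mathbf{0}$$

or equivalently,

$$\mathbf{w}' - \mathbf{w} = \mathbf{u} - \mathbf{u}'$$

Now, from this last equation we know that the vector u − u′ is in W, as it is a linear combination of the vectors w and w′, which are in W. However, u − u′ is also in $W^{\perp}$ since it is the difference of two vectors in $W^{\perp}$. Therefore, by Theorem 8, part 2, u − u′ = 0 and hence u = u′. This being the case, we now have w′ − w = 0, so that w′ = w. Thus, we have shown that for every v in V, there are unique vectors w in W and u in $W^{\perp}$ such that v = w + u and hence $V = W \oplus W^{\perp}$.

Motivated by the terminology of Example 3, we call the vector w, of Theorem 9, the orthogonal projection of v onto W, which we denote by $\operatorname{proj}_{W}\mathbf{v}$, and call u the component of v orthogonal to W.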
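The decomposition v = w + u from the Projection Theorem can be computed directly once an orthogonal basis for W is known. The sketch below is illustrative only: the name `project_onto_subspace` and the sample data are assumptions of this example. It divides each coefficient by ⟨w_i, w_i⟩ so that the basis needs to be orthogonal but not necessarily normalized; for an orthonormal basis this reduces to the formula used in the proof.

```python
from fractions import Fraction


def dot(u, v):
    """Standard dot product, used as the inner product in this sketch."""
    return sum(a * b for a, b in zip(u, v))


def project_onto_subspace(v, orthogonal_basis):
    """Orthogonal projection of v onto W = span(orthogonal_basis).

    Computes the sum of (<v, w_i> / <w_i, w_i>) w_i over the basis.
    The basis vectors must be mutually orthogonal, nonzero, and have
    integer entries (so Fraction(numerator, denominator) applies).
    """
    result = [Fraction(0)] * len(v)
    for w in orthogonal_basis:
        c = Fraction(dot(v, w), dot(w, w))
        result = [r + c * wi for r, wi in zip(result, w)]
    return result


# Sample data: an orthogonal basis for a subspace W of R^4 and a vector v0.
W_basis = [[1, 0, -1, -1], [0, 1, -1, 1]]
v0 = [1, 0, 0, 0]

w = project_onto_subspace(v0, W_basis)   # component of v0 in W
u = [a - b for a, b in zip(v0, w)]       # component of v0 orthogonal to W

print(w, u)
```

Checking that `u` has zero dot product with each vector of `W_basis` confirms, by Proposition 5, that `u` lies in the orthogonal complement of W.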


Matrices 


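In Definition 4 of Sec. 3.2 we defined the null space of an m × n matrix A, denoted by N(A), as the set of all vectors x in $\mathbb{R}^n$ such that Ax = 0. The column space of A, denoted by col(A), is the subspace of $\mathbb{R}^m$ spanned by the column vectors of A. In a similar way, the left null space of A, denoted by $N(A^{t})$, is the set of vectors y in $\mathbb{R}^m$ such that $A^{t}\mathbf{y} = \mathbf{0}$. Finally, the row space of A, which we discussed in Sec. 4.2, denoted by row(A), is the subspace of $\mathbb{R}^n$ spanned by the row vectors of A. Since the rows of A are the columns of $A^{t}$, then row(A) = col($A^{t}$). These four subspaces

$$N(A) \qquad N(A^{t}) \qquad \operatorname{col}(A) \qquad\text{and}\qquad \operatorname{col}(A^{t})$$

are referred to as the four fundamental subspaces associated with the matrix A. Theorem 10 gives relationships among them.


THEOREM 10

Let A be an m × n matrix.

1. $N(A) = \operatorname{col}(A^{t})^{\perp}$
2. $N(A^{t}) = \operatorname{col}(A)^{\perp}$


Proof (1) Let $\mathbf{v}_1, \ldots, \mathbf{v}_m$ denote the row vectors of A, so that

$$A\mathbf{x} = \begin{bmatrix} \mathbf{v}_1\cdot\mathbf{x} \\ \mathbf{v}_2\cdot\mathbf{x} \\ \vdots \\ \mathbf{v}_m\cdot\mathbf{x} \end{bmatrix}$$

First let x be a vector in N(A) so that Ax = 0. Then $\mathbf{v}_i\cdot\mathbf{x} = 0$ for i = 1, 2, ..., m. Thus, $\mathbf{x} \in \operatorname{row}(A)^{\perp} = \operatorname{col}(A^{t})^{\perp}$ and $N(A) \subseteq \operatorname{col}(A^{t})^{\perp}$. On the other hand, let x be a vector in $\operatorname{col}(A^{t})^{\perp} = \operatorname{row}(A)^{\perp}$. Then $\mathbf{x}\cdot\mathbf{v}_i = 0$, for i = 1, ..., m, so that Ax = 0. Therefore, $\operatorname{col}(A^{t})^{\perp} \subseteq N(A)$. Hence, $N(A) = \operatorname{col}(A^{t})^{\perp}$.

For part 2, substitute $A^{t}$ for A in part 1.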
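Theorem 10 can be spot-checked on a small matrix. The sketch below is only an illustration: the matrix `A`, the vectors `n`, `m`, and `b`, and the helper names are all hypothetical choices made here. It verifies that a null space vector is orthogonal to every row, that a left null space vector is orthogonal to every column, and that a vector b orthogonal to the left null space basis passes the consistency test that follows from part 2, since col(A) is then the orthogonal complement of N(A^t).

```python
def dot(u, v):
    """Standard dot product."""
    return sum(a * b for a, b in zip(u, v))


def mat_vec(A, x):
    """Matrix-vector product, computed row by row as dot products."""
    return [dot(row, x) for row in A]


def transpose(A):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


# Hypothetical sample matrix (an assumption of this sketch).
A = [[1, 0, 0],
     [0, 1, 1],
     [-1, -1, -1]]

n = [0, 1, -1]   # satisfies A n = 0, so n is in N(A)
m = [1, 1, 1]    # satisfies A^t m = 0, so m spans N(A^t) for this A

# Part 1: n is orthogonal to every row of A, hence to row(A) = col(A^t).
rows_ok = all(dot(n, row) == 0 for row in A)

# Part 2: m is orthogonal to every column of A, hence to col(A).
cols_ok = all(dot(m, col) == 0 for col in transpose(A))

# Consistency test: b lies in col(A) exactly when b is orthogonal to N(A^t).
b = [2, 1, -3]
consistent = dot(b, m) == 0

print(mat_vec(A, n), mat_vec(transpose(A), m), rows_ok, cols_ok, consistent)
```

For this sample, x = (2, 1, 0) satisfies Ax = b, which agrees with the outcome of the orthogonality test.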




Linear Systems 


Let A be an m × n matrix. In light of Theorem 10, we are now in a position to provide an analysis of the linear system Ax = b in terms of the geometric structure of Euclidean space and the fundamental subspaces of A. As a first step, we describe the action of A on a vector x in $\mathbb{R}^n$. Since row(A) = col($A^{t}$), by Theorem 10, N(A) is the orthogonal complement of row(A). Thus, by Theorem 9, a vector x in $\mathbb{R}^n$ can be uniquely written as

$$\mathbf{x} = \mathbf{x}_{\text{row}} + \mathbf{x}_{\text{null}}$$

where $\mathbf{x}_{\text{row}}$ is in the row space of A and $\mathbf{x}_{\text{null}}$ is in the null space of A. Now, multiplying x by A, we have

$$A\mathbf{x} = A(\mathbf{x}_{\text{row}} + \mathbf{x}_{\text{null}}) = A\mathbf{x}_{\text{row}} + A\mathbf{x}_{\text{null}}$$

Since $A\mathbf{x}_{\text{null}} = \mathbf{0}$, the mapping $T: \mathbb{R}^n \to \mathbb{R}^m$ defined by T(x) = Ax maps the row space of A to the column space of A. Observe that no vector in $\mathbb{R}^n$ is mapped to a nonzero vector in $N(A^{t})$, which by Theorem 10 is the orthogonal complement of the column space of A. See Fig. 2.

[Figure 2: the fundamental subspaces of A; in $\mathbb{R}^n$ the subspaces have dimensions r and n − r, and in $\mathbb{R}^m$ the subspaces, including col(A), have dimensions r and m − r]


We now consider, again from a geometric point of view, the consistency of the linear system Ax = b for an m × n matrix A and a given vector b in $\mathbb{R}^m$. We have already observed in Sec. 3.2 that Ax = b is consistent if and only if b is in the column space of A. By Theorem 10, this system is consistent if and only if b is perpendicular to the left null space of A, or equivalently, if and only if b is orthogonal to every vector in $\mathbb{R}^m$ that is orthogonal to the column vectors of A. This sounds a bit awkward. However, in cases where a basis for the null space of $A^{t}$ consists of only a few vectors, we can perform an easy check to see if Ax = b is consistent. As an illustration, let

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ -1 & -1 & -1 \end{bmatrix} \quad\text{and}\quad \mathbf{b} = \begin{bmatrix} 2 \\ 1 \\ -3 \end{bmatrix}$$

Since

$$A^{t} = \begin{bmatrix} 1 & 0 & -1 \\ 0 & 1 & -1 \\ 0 & 1 & -1 \end{bmatrix} \quad\text{then}\quad N(A^{t}) = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$$

Now as b is orthogonal to $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, by Proposition 5, b is orthogonal to $N(A^{t})$ and hence in col(A). Therefore, the linear system Ax = b is consistent.


Fact Summary


Let W be a subspace of an inner product space V.

1. The orthogonal complement of the span of a single nonzero vector v in 3-space is the plane with normal vector v.

2. The orthogonal complement of W is a subspace of V.

3. The only vector common to both W and its orthogonal complement is the zero vector.

4. If W is finite dimensional, then the vector space is the direct sum of W and its orthogonal complement. That is, $V = W \oplus W^{\perp}$.

5. If B is a basis for W, then v is in $W^{\perp}$ if and only if v is orthogonal to each vector in B.

6. If A is an m × n matrix, then $N(A) = \operatorname{col}(A^{t})^{\perp}$ and $N(A^{t}) = \operatorname{col}(A)^{\perp}$.


Exercise Set 6.4


In Exercises 1-8, find the orthogonal complement of W in $\mathbb{R}^n$ with the standard inner product.

3 0
1 2
7. W = span ia eal ee 
1 w =span {| > |S —1 2 
—2 
2. W = span } l ! k 
: 8. W = span l 0 : 
3 ST E opli pla 
3. W= va | 1 | l 1 1 
=a In Exercises 9-12, find a basis for the orthogonal 
1 complement of W in R” with the standard inner 
4. W =span 0 product. 
2 
2 1 E 
9% W= 1], 1 
5. W = span L ],] 2 ee 1 
—1 0 
=3 0 1 —2 
6. W = span 1],] 1 10. W = span -1 J, 2 
=] 1 —2 


11. W = span ab 


1 
1 
12. W = span 1 


= WN OO 


1 1 


In Exercises 13 and 14, find a basis for the orthogonal
complement of W in $P_2$ with the inner product

$$\langle p, q \rangle = \int_0^1 p(x)\,q(x)\,dx$$

13. $W = \text{span}\{x - 1,\ x^2\}$
14. $W = \text{span}\{1,\ x^2\}$


15. Let W be the subspace of $\mathbb{R}^4$, with the standard
inner product, consisting of all vectors w such that
$w_1 + w_2 + w_3 + w_4 = 0$. Find a basis for $W^\perp$.

In Exercises 16-21, W is a subspace of $\mathbb{R}^n$ with the
standard inner product. If v is in $\mathbb{R}^n$ and $\{w_1, \ldots, w_m\}$
is an orthogonal basis for W, then the orthogonal
projection of v onto W is given by

$$\text{proj}_W v = \sum_{i=1}^{m} \frac{\langle v, w_i \rangle}{\langle w_i, w_i \rangle}\, w_i$$

Find the orthogonal projection of v onto W. If
necessary, first find an orthogonal basis for W.


1 2 
16. W = span Oj;,; 1 
—1 1 
1 
v= | —2 
2 
2 
17. W = span O |,|] -l 
0 
1 
v= 2, 
—3 




3 —2 
18. W = span -l j, 2 
0 
5 
v= | -3 
1 
1 —1 
19. W = span 2 |, 3 
1 2 
1 
v=] -3 
5 
1 1 3 
3 (0) 
20. W = span ablar 1 
—1 
4 
0 
la 
0 
3 —6 
0 0 
21. W = span abe 2 
2 4 
v= 


In Exercises 22-25, W is a subspace of $\mathbb{R}^n$ with the
standard inner product.


a. Find $W^\perp$.

b. Find the orthogonal projection of v onto W.
(See Exercises 16-21.)

c. Compute $u = v - \text{proj}_W v$.

d. Show u is in $W^\perp$ so v is a sum of a vector in
W and one in $W^\perp$.

e. Make a sketch of W, $W^\perp$, v, $\text{proj}_W v$, and u.


22.
v-om{(s]}>-[]

23.
vom i Le

24.
W = span 1
0
1
v=] 1
-1

25.
W = span 2
4
2
v=] l
1

26. Show that if V is an inner product space, then

$$V^\perp = \{0\} \qquad \text{and} \qquad \{0\}^\perp = V$$

27. Show that if $W_1$ and $W_2$ are finite dimensional
subspaces of an inner product space and
$W_1 \subset W_2$, then $W_2^\perp \subset W_1^\perp$.

28. Let V = C[-1, 1] with the inner product

$$\langle f, g \rangle = \int_{-1}^{1} f(x)\,g(x)\,dx$$

and $W = \{f \in V \mid f(-x) = f(x)\}$.

a. Show that W is a subspace of V.

b. Show $W^\perp = \{f \in V \mid f(-x) = -f(x)\}$.

c. Verify that $W \cap W^\perp = \{0\}$.

d. Let $g(x) = \frac{1}{2}[f(x) + f(-x)]$ and
$h(x) = \frac{1}{2}[f(x) - f(-x)]$. Verify
g(-x) = g(x) and h(-x) = -h(x), so every
f can be written as the sum of a function in W
and a function in $W^\perp$.

29. Let $V = M_{2 \times 2}$ with the inner product

$$\langle A, B \rangle = \text{tr}(B^t A)$$

Let $W = \{A \in V \mid A \text{ is symmetric}\}$.

a. Show that

$$W^\perp = \{A \in V \mid A \text{ is skew symmetric}\}$$

b. Show that every A in V can be written as the
sum of matrices from W and $W^\perp$.

30. In $\mathbb{R}^2$ with the standard inner product, the
transformation that sends a vector to the
orthogonal projection onto a subspace W is a
linear transformation. Let W = span { | : \ :

a. Find the matrix representation P relative to the
standard basis for the orthogonal projection of
$\mathbb{R}^2$ onto W.

b. Let v= | i } Find $\text{proj}_W v$ and verify the
result is the same by applying the matrix P
found in part (a).

c. Show $P^2 = P$.

31. If W is a finite dimensional subspace of an inner
product space, show that $(W^\perp)^\perp = W$.


6.5 Application: Least Squares Approximation


There are many applications in mathematics and science where an exact solution to 
a problem cannot be found, but an approximate solution exists that is sufficient to 
satisfy the demands of the application. Consider the problem of finding the equation 
of a line going through the points (1, 2), (2, 1), and (3, 3). Observe from Fig. 1 that
this problem has no solution as the three points are noncollinear.

Figure 1

This leads to the problem of finding the line that is the best fit for these three
points based on some criterion for measuring goodness of fit. There are different ways
of solving this new problem. One way, which uses calculus, is based on the idea
of finding the line that minimizes the sum of the squared distances between itself and
each of the points. Another quite elegant method uses the concepts of linear algebra to
produce the same result. To illustrate the technique, we consider the original problem
of finding an equation of the form y = mx + b that is satisfied by the points (1, 2),
(2, 1), and (3, 3). Substitution of these points into the equation y = mx + b yields the
linear system

$$\begin{aligned} m + b &= 2 \\ 2m + b &= 1 \\ 3m + b &= 3 \end{aligned}$$


As noted above, this system is inconsistent. As a first step toward finding an
optimal approximate solution, we let

$$A = \begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{bmatrix} \qquad \mathbf{x} = \begin{bmatrix} m \\ b \end{bmatrix} \qquad \text{and} \qquad \mathbf{b} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix}$$

and we write the linear system as Ax = b. From this perspective, we see that the
linear system is inconsistent as b is not in col(A). Thus, the best we can do is to look
for a vector w in col(A) that is as close as possible to b, as shown in Fig. 2.


Figure 2 




Figure 3 


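We will soon see that the optimal choice is to let w be the orthogonal projection of b
onto col(A). In this case, to find w we let

$$W = \text{col}(A) = \text{span}\left\{ \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$$

By Theorem 9 of Sec. 6.4, the vector b can be written uniquely as

$$\mathbf{b} = \mathbf{w} + \mathbf{y}$$

where y is in $W^\perp$. By Theorem 10 of Sec. 6.4, we have $W^\perp = N(A^t)$. By row
reducing $A^t$, the orthogonal complement of W is

$$W^\perp = \text{span}\left\{ \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} \right\}$$

As this space is one-dimensional, the computations are simplified by finding y first,
which in the terminology of Sec. 6.4 is the component of b orthogonal to W. To find
y, we use Definition 1 of Sec. 6.3 and compute the orthogonal projection of b onto
$W^\perp$. Let $\mathbf{v} = \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}$, so that

$$\mathbf{y} = \frac{\mathbf{b} \cdot \mathbf{v}}{\mathbf{v} \cdot \mathbf{v}}\, \mathbf{v} = \frac{1}{2} \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix}$$

Hence,

$$\mathbf{w} = \mathbf{b} - \mathbf{y} = \begin{bmatrix} 2 \\ 1 \\ 3 \end{bmatrix} - \frac{1}{2} \begin{bmatrix} 1 \\ -2 \\ 1 \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}$$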


Finally, to find values for m and b, we solve the linear system Ax = w, that is, the
system

$$\begin{bmatrix} 1 & 1 \\ 2 & 1 \\ 3 & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \frac{1}{2} \begin{bmatrix} 3 \\ 4 \\ 5 \end{bmatrix}$$

By Theorem 6 of Sec. 3.2, this last linear system is consistent since the vector on the
right-hand side is in col(A). Solving the linear system, we obtain $m = \frac{1}{2}$ and b = 1,
giving us the slope and the y intercept, respectively, for the best-fit line $y = \frac{1}{2}x + 1$.


The vector

$$\begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} \frac{1}{2} \\ 1 \end{bmatrix}$$

is the least squares solution to the system Ax = b since it produces a line whose total
squared distances from the given points are minimal, as shown in Fig. 3. Finding the
line that best fits a set of data points is called linear regression.
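The projection argument for the three points (1, 2), (2, 1), and (3, 3) can be checked with exact arithmetic. The following is a minimal sketch using only Python's standard library; the helper name `dot` and the variable names are illustrative choices, not notation from the text.

```python
from fractions import Fraction as F

def dot(u, v):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

# Right-hand side b and a vector v spanning the orthogonal complement of col(A).
b = [F(2), F(1), F(3)]
v = [F(1), F(-2), F(1)]

# y is the projection of b onto span{v}; w = b - y then lies in col(A).
coeff = dot(b, v) / dot(v, v)
y = [coeff * vi for vi in v]
w = [bi - yi for bi, yi in zip(b, y)]

# Solve m + b0 = w[0] and 2m + b0 = w[1]; the third equation 3m + b0 = w[2]
# holds automatically because w is in col(A).
slope = w[1] - w[0]
intercept = w[0] - slope

print(slope, intercept)
```

Working with `Fraction` keeps every intermediate value exact, so the slope and intercept can be compared directly with the values obtained by hand.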


Figure 4: The decomposition of b into $w_1$ in col(A) and $w_2$ orthogonal to col(A).


Least Squares Solutions 


We now consider the general problem of finding a least squares solution to the m × n
linear system Ax = b. An exact solution exists if b is in col(A); moreover, the solution
is unique if the columns of A are linearly independent. In the case where b is not in
col(A), we look for a vector x in $\mathbb{R}^n$ that makes the error term $\| b - Ax \|$ as small
as possible. Using the standard inner product on $\mathbb{R}^m$ to define the length of a vector,
we have

$$\begin{aligned} \| b - Ax \|^2 &= (b - Ax) \cdot (b - Ax) \\ &= [b_1 - (Ax)_1]^2 + [b_2 - (Ax)_2]^2 + \cdots + [b_m - (Ax)_m]^2 \end{aligned}$$

This equation gives the rationale for the term least squares solution.

To find the least squares solution $\hat{x}$ to the linear system Ax = b, we let W =
col(A). As W is a finite dimensional subspace of $\mathbb{R}^m$, by Theorem 9 of Sec. 6.4, the
vector b can be written uniquely as

$$b = w_1 + w_2$$

where $w_1$ is the orthogonal projection of b onto W and $w_2$ is the component of b
orthogonal to W, as shown in Fig. 4.


We now show that the orthogonal projection minimizes the error term $\| b - Ax \|$,
for all x in $\mathbb{R}^n$. First, we have

$$\begin{aligned} \| b - Ax \|^2 &= \| w_1 + w_2 - Ax \|^2 \\ &= \langle w_2 + (w_1 - Ax),\ w_2 + (w_1 - Ax) \rangle \\ &= \langle w_2, w_2 \rangle + 2 \langle w_2, w_1 - Ax \rangle + \langle w_1 - Ax, w_1 - Ax \rangle \end{aligned}$$

Since $w_1$ and Ax are in W and $w_2$ is in $W^\perp$, the middle term vanishes, giving

$$\begin{aligned} \| b - Ax \|^2 &= \langle w_2, w_2 \rangle + \langle w_1 - Ax, w_1 - Ax \rangle \\ &= \| w_2 \|^2 + \| w_1 - Ax \|^2 \end{aligned}$$

The quantity on the right-hand side is minimized if x is any solution to

$$Ax = w_1$$

Since $w_1$ is in col(A), this linear system is consistent. We call any vector $\hat{x}$ in $\mathbb{R}^n$ such
that $A\hat{x} = w_1$ a least squares solution of Ax = b. Moreover, the solution is unique
if the columns of A are linearly independent.

Occasionally, as was the case for the example at the beginning of this section,
it is possible to find $w_1$ directly. The least squares solution can then be found by
solving $Ax = w_1$. In most cases, however, the vector $w_1$ is hard to obtain. Solving
the normal equation

$$A^t A x = A^t b$$

circumvents this difficulty.


THEOREM 11


Let A be an m × n matrix and b a vector in $\mathbb{R}^m$. A vector $\hat{x}$ in $\mathbb{R}^n$ is a solution to
the normal equation

$$A^t A x = A^t b$$

if and only if it is a least squares solution to Ax = b.


Proof From the discussion just before the theorem, we know that a least squares
solution $\hat{x}$ to Ax = b exists. By Theorem 9 of Sec. 6.4, there are unique vectors
$w_1$ in W = col(A) and $w_2$ in $W^\perp$ such that $b = w_1 + w_2$.

First assume that $\hat{x}$ is a least squares solution. Since $w_2$ is orthogonal to
the columns of A, then $A^t w_2 = 0$. Moreover, since $\hat{x}$ is a least squares solution,
$A\hat{x} = w_1$. Therefore,

$$A^t A \hat{x} = A^t w_1 = A^t (b - w_2) = A^t b$$

so that $\hat{x}$ is a solution to the normal equation.

Conversely, we now show that if $\hat{x}$ is a solution to $A^t A x = A^t b$, then it is also
a least squares solution to Ax = b. Suppose that $A^t A \hat{x} = A^t b$, or equivalently,

$$A^t (b - A\hat{x}) = 0$$

Consequently, the vector $b - A\hat{x}$ is orthogonal to each row of $A^t$ and hence to
each column of A. Since the columns of A span W, the vector $b - A\hat{x}$ is in $W^\perp$.
Hence, b can be written as

$$b = A\hat{x} + (b - A\hat{x})$$

where $A\hat{x}$ is in W = col(A) and $b - A\hat{x}$ is in $W^\perp$. Again, by Theorem 9 of Sec. 6.4,
this decomposition of the vector b is unique and hence $A\hat{x} = w_1$. Therefore, $\hat{x}$ is
a least squares solution.


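Let

$$A = \begin{bmatrix} -2 & 3 \\ 1 & -2 \\ 1 & -1 \end{bmatrix} \qquad \text{and} \qquad b = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$$

a. Find the least squares solution to Ax = b.

b. Find the orthogonal projection of b onto W = col(A) and the decomposition
$b = w_1 + w_2$, where $w_1$ is in W and $w_2$ is in $W^\perp$.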


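Solution

a. Since the linear system Ax = b is inconsistent, the least squares solution is
the best approximation we can find. By Theorem 11, the least squares solution
can be found by solving the normal equation

$$A^t A x = A^t b$$

In this case the normal equation becomes

$$\begin{bmatrix} -2 & 1 & 1 \\ 3 & -2 & -1 \end{bmatrix} \begin{bmatrix} -2 & 3 \\ 1 & -2 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -2 & 1 & 1 \\ 3 & -2 & -1 \end{bmatrix} \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix}$$

which simplifies to

$$\begin{bmatrix} 6 & -9 \\ -9 & 14 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -1 \\ 3 \end{bmatrix}$$

The matrix on the left-hand side is invertible, so that

$$\begin{bmatrix} x \\ y \end{bmatrix} = \frac{1}{3} \begin{bmatrix} 14 & 9 \\ 9 & 6 \end{bmatrix} \begin{bmatrix} -1 \\ 3 \end{bmatrix}$$

The least squares solution is then given by

$$\hat{x} = \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \frac{13}{3} \\ 3 \end{bmatrix}$$

b. To find the orthogonal projection $w_1$ of b onto col(A), we use the fact that
$w_1 = A\hat{x}$. So

$$w_1 = \begin{bmatrix} -2 & 3 \\ 1 & -2 \\ 1 & -1 \end{bmatrix} \begin{bmatrix} \frac{13}{3} \\ 3 \end{bmatrix} = \frac{1}{3} \begin{bmatrix} 1 \\ -5 \\ 4 \end{bmatrix}$$

We now find $w_2$ from the equation $w_2 = b - w_1$, so that

$$w_2 = b - w_1 = \begin{bmatrix} 1 \\ -1 \\ 2 \end{bmatrix} - \frac{1}{3} \begin{bmatrix} 1 \\ -5 \\ 4 \end{bmatrix} = \frac{2}{3} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

The decomposition of b is then given by

$$b = w_1 + w_2 = \frac{1}{3} \begin{bmatrix} 1 \\ -5 \\ 4 \end{bmatrix} + \frac{2}{3} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$$

Note that $w_2$ is orthogonal to each of the columns of A.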
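The normal-equation computation in this example can be reproduced with exact fractions. The sketch below assumes the data A with rows (-2, 3), (1, -2), (1, -1) and b = (1, -1, 2), the vector obtained by adding the two parts of the decomposition of b; the helper names `transpose`, `matmul`, and `solve_2x2` are hypothetical and not part of the text.

```python
from fractions import Fraction as F

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def solve_2x2(M, r):
    """Solve M x = r for an invertible 2 x 2 matrix M by Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    x0 = (r[0] * M[1][1] - M[0][1] * r[1]) / det
    x1 = (M[0][0] * r[1] - r[0] * M[1][0]) / det
    return [x0, x1]

A = [[F(-2), F(3)], [F(1), F(-2)], [F(1), F(-1)]]
b = [F(1), F(-1), F(2)]

# Build the normal equation (A^t A) x = A^t b.
At = transpose(A)
AtA = matmul(At, A)
Atb = [sum(a * bi for a, bi in zip(row, b)) for row in At]

# Least squares solution, projection w1 = A x_hat, and residual w2 = b - w1.
x_hat = solve_2x2(AtA, Atb)
w1 = [sum(a * x for a, x in zip(row, x_hat)) for row in A]
w2 = [bi - wi for bi, wi in zip(b, w1)]

# w2 should be orthogonal to every column of A, that is, A^t w2 = 0.
residual_check = [sum(a * wi for a, wi in zip(row, w2)) for row in At]
print(x_hat, w1, w2, residual_check)
```

Cramer's rule is used only because the system has two unknowns; for larger systems a general elimination routine would be the natural replacement.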


Linear Regression 


Example 2 illustrates the use of least squares approximation to find trends in data sets. 


The data in Table 1, which are also shown in the scatter plot in Fig. 5, give the
average temperature, in degrees Celsius (°C), of the earth's surface from 1975 through
2002.* Find the equation of the line that best fits these data points.


*Worldwatch Institute, Vital Signs 2006-2007. The trends that are shaping our future, W. W. Norton and 
Company, New York London, 2006. 


Table 1

Year | Temperature (°C) | Year | Temperature (°C) | Year | Temperature (°C)
1975 | 13.94 | 1985 | 14.03 | 1994 | 14.25
1976 | 13.86 | 1986 | 14.12 | 1995 | 14.37
1977 | 14.11 | 1987 | 14.27 | 1996 | 14.23
1978 | 14.02 | 1988 | 14.29 | 1997 | 14.40
1979 | 14.09 | 1989 | 14.19 | 1998 | 14.56
1980 | 14.16 | 1990 | 14.37 | 1999 | 14.32
1981 | 14.22 | 1991 | 14.32 | 2000 | 14.31
1982 | 14.04 | 1992 | 14.14 | 2001 | 14.46
1983 | 14.25 | 1993 | 14.14 | 2002 | 14.52
1984 | 14.07 | | | |


Figure 5: Scatter plot of the data in Table 1 (horizontal axis: Year; vertical axis: Temperature).


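Solution Denote the data points by $(x_i, y_i)$, for i = 1, 2, ..., 28, where $x_i$ is the year starting
with $x_1 = 1975$ and $y_i$ is the average global temperature for that year. A line with
equation y = mx + b will pass through all the data points if the linear system

$$\begin{aligned} m(1975) + b &= 13.94 \\ m(1976) + b &= 13.86 \\ &\ \,\vdots \\ m(2002) + b &= 14.52 \end{aligned}$$

has a solution. In matrix form, this linear system becomes

$$\begin{bmatrix} 1975 & 1 \\ 1976 & 1 \\ \vdots & \vdots \\ 2002 & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} 13.94 \\ 13.86 \\ \vdots \\ 14.52 \end{bmatrix}$$

Since the linear system is inconsistent, to obtain the best fit of the data we seek
values for m and b such that $x = \begin{bmatrix} m \\ b \end{bmatrix}$ is a least squares solution. The normal
equation for this system is given by

$$\begin{bmatrix} 1975 & \cdots & 2002 \\ 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} 1975 & 1 \\ \vdots & \vdots \\ 2002 & 1 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} 1975 & \cdots & 2002 \\ 1 & \cdots & 1 \end{bmatrix} \begin{bmatrix} 13.94 \\ \vdots \\ 14.52 \end{bmatrix}$$

which simplifies to

$$\begin{bmatrix} 110{,}717{,}530 & 55{,}678 \\ 55{,}678 & 28 \end{bmatrix} \begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} 791{,}553.23 \\ 398.05 \end{bmatrix}$$

The least squares solution is

$$\begin{bmatrix} m \\ b \end{bmatrix} = \begin{bmatrix} 0.0168609742 \\ -19.31197592 \end{bmatrix}$$

The line that best fits the data is then given by y = 0.0168609742x - 19.31197592,
as shown in Fig. 6.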
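The entries of the normal equation for the temperature data can be recomputed directly from Table 1. The following is a minimal sketch in standard-library Python; the variable names (`sx`, `sxx`, `sxy`, and so on) are illustrative, and exact fractions are used so that the large sums do not lose precision before the final division.

```python
from fractions import Fraction as F

# Table 1: average surface temperature (degrees Celsius), 1975 through 2002.
temps = """13.94 13.86 14.11 14.02 14.09 14.16 14.22 14.04 14.25 14.07
14.03 14.12 14.27 14.29 14.19 14.37 14.32 14.14 14.14 14.25
14.37 14.23 14.40 14.56 14.32 14.31 14.46 14.52""".split()
xs = [F(year) for year in range(1975, 2003)]
ys = [F(t) for t in temps]

# Entries of A^t A and A^t b for the matrix A with rows [x_i, 1].
n = len(xs)
sx, sy = sum(xs), sum(ys)
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))

# Solve the 2 x 2 normal equation [[sxx, sx], [sx, n]] [m, b]^t = [sxy, sy]^t
# by Cramer's rule.
det = sxx * n - sx * sx
m = (sxy * n - sx * sy) / det
b = (sxx * sy - sx * sxy) / det

print(float(m), float(b))
```

Forming $A^t A$ explicitly is fine for a two-parameter fit like this one; with floating-point arithmetic and year values near 2000 the two terms in each numerator nearly cancel, which is the reason for the exact `Fraction` sums here.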


Figure 6: The data of Table 1 with the best-fit line (horizontal axis: Year; vertical axis: Temperature).


The procedure used in Example 2 can be extended to fit data with a polynomial
of any degree n ≥ 1. For example, if n = 2, to find the best-fit parabola of the form
$y = ax^2 + bx + c$ for a set of m data points requires finding the least squares solution
to an m × 3 linear system. See Exercise 6.
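As a sketch of how the parabola case might be carried out, the code below builds the normal equation for the matrix with rows $[x_i^2,\ x_i,\ 1]$ and solves it by Gauss-Jordan elimination. The function names `solve` and `fit_parabola` and the sample points are hypothetical, chosen only for illustration, and the routine assumes the data contain at least three distinct x values so that $A^t A$ is invertible.

```python
from fractions import Fraction as F

def solve(M, r):
    """Gauss-Jordan elimination with exact fractions; assumes M is invertible."""
    n = len(M)
    aug = [[F(e) for e in row] + [F(ri)] for row, ri in zip(M, r)]
    for c in range(n):
        # Find a row with a nonzero entry in column c and move it into place.
        p = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [e / pivot for e in aug[c]]
        # Clear column c in every other row.
        for i in range(n):
            if i != c:
                factor = aug[i][c]
                aug[i] = [ei - factor * ec for ei, ec in zip(aug[i], aug[c])]
    return [row[-1] for row in aug]

def fit_parabola(points):
    """Return [a, b, c] for the least squares parabola y = a x^2 + b x + c."""
    rows = [[x * x, x, 1] for x, _ in points]
    ys = [y for _, y in points]
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    return solve(AtA, Aty)

# Hypothetical sample data lying exactly on y = 2x^2 - 3x + 1.
sample = [(0, 1), (1, 0), (2, 3), (3, 10), (4, 21)]
print(fit_parabola(sample))
```

When the data lie exactly on a parabola, as in this sample, the least squares solution recovers its coefficients; for noisy data the same call returns the best-fit coefficients.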


Fourier Polynomials 


A trigonometric polynomial of degree n is an expression in cosines and sines of
the form

$$a_0 + a_1 \cos x + b_1 \sin x + a_2 \cos 2x + b_2 \sin 2x + \cdots + a_n \cos nx + b_n \sin nx$$

where the coefficients $a_0, a_1, b_1, a_2, b_2, \ldots, a_n, b_n$ are real numbers. Let $PC[-\pi, \pi]$
denote the vector space of piecewise continuous functions on the interval $[-\pi, \pi]$.
The vector space $PC[-\pi, \pi]$ is an inner product space with inner product defined by

$$\langle f, g \rangle = \int_{-\pi}^{\pi} f(x)\,g(x)\,dx$$

Suppose now that given a piecewise continuous function f defined on $[-\pi, \pi]$, which
may or may not be a trigonometric polynomial, we wish to find the trigonometric
polynomial of degree n that best approximates the function.


To solve this problem using linear algebra, let W be the subspace of $PC[-\pi, \pi]$ of
trigonometric polynomials of degree n or less. Let $f_0(x) = 1/\sqrt{2\pi}$, and for k ≥ 1, let

$$f_k(x) = \frac{1}{\sqrt{\pi}} \cos kx \qquad \text{and} \qquad g_k(x) = \frac{1}{\sqrt{\pi}} \sin kx$$

Define the set B by

$$\begin{aligned} B &= \{ f_0, f_1, f_2, \ldots, f_n, g_1, g_2, \ldots, g_n \} \\ &= \left\{ \frac{1}{\sqrt{2\pi}}, \frac{1}{\sqrt{\pi}} \cos x, \frac{1}{\sqrt{\pi}} \cos 2x, \ldots, \frac{1}{\sqrt{\pi}} \cos nx, \frac{1}{\sqrt{\pi}} \sin x, \frac{1}{\sqrt{\pi}} \sin 2x, \ldots, \frac{1}{\sqrt{\pi}} \sin nx \right\} \end{aligned}$$

It can be verified that relative to the inner product above, B is an orthonormal basis
for W. Now let f be a function in $PC[-\pi, \pi]$. Since W is finite dimensional, f has
the unique decomposition

$$f = f_W + f_{W^\perp}$$

with $f_W$ in W and $f_{W^\perp}$ in $W^\perp$. Since B is already an orthonormal basis for W, then
$f_W$ can be found by using the formula for the orthogonal projection given in the proof
of Theorem 9 in Sec. 6.4. In this case we have

$$f_W = \langle f, f_0 \rangle f_0 + \langle f, f_1 \rangle f_1 + \cdots + \langle f, f_n \rangle f_n + \langle f, g_1 \rangle g_1 + \cdots + \langle f, g_n \rangle g_n$$

We now claim that $f_W$, defined in this way, is the best approximation for f in W.
That is,

$$\| f - f_W \| \le \| f - w \| \qquad \text{for all } w \in W$$

To establish the claim, observe that

$$\begin{aligned} \| f - w \|^2 &= \| f_W + f_{W^\perp} - w \|^2 \\ &= \| f_{W^\perp} + (f_W - w) \|^2 \\ &= \langle f_{W^\perp} + (f_W - w),\ f_{W^\perp} + (f_W - w) \rangle \\ &= \langle f_{W^\perp}, f_{W^\perp} \rangle + 2 \langle f_{W^\perp}, f_W - w \rangle + \langle f_W - w, f_W - w \rangle \end{aligned}$$

The middle term of the last equation is zero since $f_{W^\perp}$ and $f_W - w$ are orthogonal.
So

$$\| f - w \|^2 = \| f_{W^\perp} \|^2 + \| f_W - w \|^2$$

Observe that the right-hand side is minimized if $w = f_W$, that is, if we choose w to
be the orthogonal projection of f onto W. The function $f_W$ is called the Fourier
polynomial of degree n for f.


Let

$$f(x) = \begin{cases} -1 & -\pi \le x < 0 \\ \phantom{-}1 & 0 < x \le \pi \end{cases}$$

Find the Fourier polynomial for f of degree n = 5.


Solution The graph of y = f(x) is shown in Fig. 7. Since f(x) is an odd function and
$f_k(x)$ is an even function for k ≥ 0, the product $f(x) f_k(x)$ is also an odd function.
Hence, the integral on any symmetric interval about the origin is 0, and we have

$$\langle f, f_k \rangle = 0 \qquad \text{for } k \ge 0$$

Now for k ≥ 1, we have

$$\begin{aligned} \langle f, g_k \rangle &= \int_{-\pi}^{\pi} f(x)\,g_k(x)\,dx \\ &= -\frac{1}{\sqrt{\pi}} \int_{-\pi}^{0} \sin kx\,dx + \frac{1}{\sqrt{\pi}} \int_{0}^{\pi} \sin kx\,dx \\ &= \frac{1}{k\sqrt{\pi}}\,[2 - 2\cos k\pi] \\ &= \begin{cases} 0 & \text{if } k \text{ is even} \\ \dfrac{4}{k\sqrt{\pi}} & \text{if } k \text{ is odd} \end{cases} \end{aligned}$$

Figure 7

Therefore, the Fourier polynomial of degree 5 that best approximates the function
f on the interval $[-\pi, \pi]$ is

$$p(x) = \frac{4}{\pi} \sin x + \frac{4}{3\pi} \sin 3x + \frac{4}{5\pi} \sin 5x$$

In Fig. 8 we see the function and its Fourier approximations for n = 1, 3, and 5.

Figure 8: $p_1(x) = \frac{4}{\pi} \sin x$; $p_3(x) = \frac{4}{\pi} \sin x + \frac{4}{3\pi} \sin 3x$; $p_5(x) = \frac{4}{\pi} \sin x + \frac{4}{3\pi} \sin 3x + \frac{4}{5\pi} \sin 5x$
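The sine coefficients of this Fourier polynomial can be checked numerically. The sketch below approximates the inner product on $[-\pi, \pi]$ with a midpoint rule; the function names `inner`, `f`, and `sine_coefficient`, and the number of subintervals, are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def inner(f, g, n=20000):
    """Approximate the integral of f(x) g(x) over [-pi, pi] by the midpoint rule."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        x = -math.pi + (i + 0.5) * h   # midpoint of the i-th subinterval
        total += f(x) * g(x)
    return h * total

def f(x):
    """The step function of the example: -1 on the left of 0, 1 on the right."""
    return -1.0 if x < 0 else 1.0

def sine_coefficient(k):
    """Coefficient of sin(kx) in the Fourier polynomial: <f, g_k> / sqrt(pi)."""
    def g_k(x):
        return math.sin(k * x) / math.sqrt(math.pi)
    return inner(f, g_k) / math.sqrt(math.pi)

coeffs = [sine_coefficient(k) for k in range(1, 6)]
print(coeffs)
```

With an even number of subintervals no midpoint falls exactly at x = 0, so the value of f at the jump never enters the sum. The computed values can be compared with 4/(kπ) for odd k and 0 for even k.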
Exercise Set 6.5

1. Let

$$A = \begin{bmatrix} 1 & 3 \\ 1 & 3 \\ 2 & 3 \end{bmatrix} \qquad \text{and} \qquad b = \begin{bmatrix} 4 \\ 1 \\ 5 \end{bmatrix}$$

a. Find the least squares solution to Ax = b.

b. Find the orthogonal projection of b onto
W = col(A) and the decomposition of the
vector $b = w_1 + w_2$, where $w_1$ is in W and $w_2$
is in $W^\perp$.


a. Find the least squares solution to Ax = b.

b. Find the orthogonal projection of b onto
W = col(A) and the decomposition of the
vector $b = w_1 + w_2$, where $w_1$ is in W and $w_2$
is in $W^\perp$.

3. The table gives world hydroelectricity use in
thousands of terawatthours.


1965 927 1990 2185
1970 1187 1995 2513
1975 1449 2000 2713
1980 1710 2004 2803
1985 2004


a. Sketch a scatter plot of the data.

b. Find the linear function that is the best fit to
the data.


4. The table gives world infant mortality rates in
deaths per 1000 live births.


157 1985 78

141 1990 70

119 1995 66

104 2000 62
93 2005 57
87


a. Sketch a scatter plot of the data.

b. Find the linear function that is the best fit to
the data.


5. The table gives world population in billions. 


2.56 
3.04 
3.71 
4.46 
5.28 
6.08 


a. Sketch a scatter plot of the data. 


b. Find the linear function that is the best fit to 
the data. 


6. The table gives the worldwide cumulative HIV
infections in millions.


1980 0.1 29.8 
1982 0.7 40.9 
1985 2.4 57.9 
1987 4.5 67.9 
1990 10 82.7 
1992 16.1 


a. Sketch a scatter plot of the data. 


b. Find a curve of the form y = ax? + bx +c 
that best fits the data. 


7. Let f(x) = x on the interval $-\pi \le x \le \pi$.


a. Find the Fourier polynomials for f of degrees 
n = 2,3,4, and 5. 

b. Sketch y = f(x) along with the polynomials 
found in part (a). 


8. Let


x if0<x<t 
Ot i if -nm<x<0O 


a. Find the Fourier polynomials for f of degrees
n = 2, 3, 4, and 5.

b. Sketch y = f(x) along with the polynomials
found in part (a).


9. Let $f(x) = x^2$ on the interval $-\pi \le x \le \pi$.

a. Find the Fourier polynomials for f of degrees
n = 2, 3, 4, and 5.

b. Sketch y = f(x) along with the polynomials
found in part (a).


10. Let A be an m × n matrix with rank(A) = n, and
suppose A = QR is a QR factorization of A.
(See Exercise 9, Review Exercises for Chapter 6.)
Show that the best least squares solution to the
linear system Ax = b can be found by back
substitution on the upper triangular system
$Rx = Q^t b$.


6.6 Diagonalization of Symmetric Matrices


In Sec. 5.2 a methodology was given for diagonalizing a square matrix. A
characterization was also provided to determine which n × n matrices were diagonalizable.
Recall, specifically, from Theorem 2 of Sec. 5.2 that an n × n matrix is
diagonalizable if and only if it has n linearly independent eigenvectors. As we have seen, the
application of this theorem requires finding all eigenvectors of a matrix. In certain
cases, however, we can tell by inspection if a matrix is diagonalizable. An example of
such a case was given in Example 4 of Sec. 5.2, where it was shown that any 2 × 2
real symmetric matrix is diagonalizable with real eigenvalues. That this is the case in
general is the subject of this section.

In the remarks preceding Example 8 of Sec. 3.1, we defined the set of complex
numbers $\mathbb{C}$. The proof of our main result requires that the reader be familiar
with some of the terminology and notation from complex variables. In particular, if
z = a + bi is a complex number, then the conjugate of z, denoted by $\bar{z}$, is given by
$\bar{z} = a - bi$.

Two complex numbers are equal if and only if their real and imaginary parts are
equal. From this we know that a complex number z satisfies $z = \bar{z}$ if and only if z is a real
number. To see this, first suppose that $z = \bar{z}$. Then bi = -bi or 2bi = 0 and hence
b = 0. We therefore have z = a + 0i and z is a real number. Conversely, if z is a real
number, then z = a + 0i = a and $\bar{z} = a - 0i = a$ so that $z = \bar{z}$.

We can also define this bar notation for vectors and matrices. So if v is a vector
with complex components and M is a matrix with complex entries, then

$$\bar{v} = \begin{bmatrix} \bar{v}_1 \\ \bar{v}_2 \\ \vdots \\ \bar{v}_n \end{bmatrix} \qquad \text{and} \qquad \bar{M} = \begin{bmatrix} \bar{a}_{11} & \bar{a}_{12} & \cdots & \bar{a}_{1n} \\ \bar{a}_{21} & \bar{a}_{22} & \cdots & \bar{a}_{2n} \\ \vdots & \vdots & & \vdots \\ \bar{a}_{m1} & \bar{a}_{m2} & \cdots & \bar{a}_{mn} \end{bmatrix}$$


We are now ready to state our main result. 


THEOREM 12


The eigenvalues of an n × n real symmetric matrix A are all real numbers.


Proof Let v be an eigenvector of A corresponding to the eigenvalue λ. To show
that λ is a real number, we will show that $\bar{\lambda} = \lambda$. We first consider the matrix
product $(\bar{v}^t A v)^t$, which by Theorem 6 of Sec. 1.3 can be written as

$$(\bar{v}^t A v)^t = v^t A^t \bar{v}$$

Since A is symmetric, $A^t = A$. Also since A has real entries, then $\bar{A} = A$.
Therefore,

$$v^t A^t \bar{v} = v^t A \bar{v} = v^t \bar{A} \bar{v} = v^t\, \overline{Av}$$

Now as v is an eigenvector of A corresponding to the eigenvalue λ, then Av = λv,
so that

$$v^t\, \overline{Av} = v^t\, \overline{\lambda v} = v^t \bar{\lambda} \bar{v} = \bar{\lambda}\, v^t \bar{v}$$

Alternatively, the original expression can be evaluated by

$$(\bar{v}^t A v)^t = (\bar{v}^t \lambda v)^t = \lambda (\bar{v}^t v)^t = \lambda\, v^t \bar{v}$$

Equating these results gives

$$\bar{\lambda}\, v^t \bar{v} = \lambda\, v^t \bar{v} \qquad \text{that is} \qquad (\bar{\lambda} - \lambda)\, v^t \bar{v} = 0$$

Since v is an eigenvector of A, and therefore nonzero, so is $v^t \bar{v}$. By an extension
to the complex numbers of Theorem 2 (part 4) of Sec. 3.1, we have $\bar{\lambda} - \lambda = 0$;
hence $\lambda = \bar{\lambda}$, establishing that λ is a real number.


One consequence of Theorem 12 is that the eigenvectors of a real symmetric
matrix have real components. To see this, let A be a symmetric matrix with real
entries and v an eigenvector corresponding to the real eigenvalue λ = a. Observe that
v is a vector in the null space of the real n × n matrix A - aI. By Theorem 7 of
Sec. 3.2, N(A - aI) is a subspace of $\mathbb{R}^n$. Thus, v, being a vector in $\mathbb{R}^n$, has real
components as claimed.


Let A be the symmetric matrix defined by

$$A = \begin{bmatrix} 2 & 0 & 2 \\ 0 & 0 & -2 \\ 2 & -2 & 1 \end{bmatrix}$$

Verify that the eigenvalues and corresponding eigenvectors of A are real.


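Solution


The characteristic equation of A is

$$\det(A - \lambda I) = -\lambda^3 + 3\lambda^2 + 6\lambda - 8 = 0$$

After factoring the characteristic polynomial we obtain

$$(\lambda - 1)(\lambda + 2)(\lambda - 4) = 0$$

Thus, the eigenvalues of A are $\lambda_1 = 1$, $\lambda_2 = -2$, and $\lambda_3 = 4$.
To find eigenvectors corresponding to $\lambda_1 = 1$, we find the null space of A - I.
To do this, we see that

$$A - I = \begin{bmatrix} 1 & 0 & 2 \\ 0 & -1 & -2 \\ 2 & -2 & 0 \end{bmatrix} \qquad \text{reduces to} \qquad \begin{bmatrix} 1 & 0 & 2 \\ 0 & 1 & 2 \\ 0 & 0 & 0 \end{bmatrix}$$

Thus, an eigenvector corresponding to $\lambda_1 = 1$ is $v_1 = \begin{bmatrix} -2 \\ -2 \\ 1 \end{bmatrix}$. In a similar way
we have that eigenvectors corresponding to $\lambda_2 = -2$ and $\lambda_3 = 4$ are, respectively,

$$v_2 = \begin{bmatrix} 1 \\ -2 \\ -2 \end{bmatrix} \qquad \text{and} \qquad v_3 = \begin{bmatrix} -2 \\ 1 \\ -2 \end{bmatrix}$$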

Orthogonal Diagonalization 


In Sec. 6.1 we showed that two vectors u and v in $\mathbb{R}^n$ are orthogonal if and only if their
dot product u · v = 0. An equivalent formulation of this condition can be developed
by using matrix multiplication. To do this, observe that if u and v are vectors in $\mathbb{R}^n$,
then $v^t u$ is a matrix with a single entry equal to u · v. Hence, we know that u and v
are orthogonal if and only if $v^t u = 0$.

Theorem 13 shows that eigenvectors that correspond to distinct eigenvalues of a
real symmetric matrix are orthogonal.


THEOREM 13


Let A be a real symmetric matrix and $v_1$ and $v_2$ be eigenvectors corresponding,
respectively, to the distinct eigenvalues $\lambda_1$ and $\lambda_2$. Then $v_1$ and $v_2$ are orthogonal.


Proof We have already shown that $v_1$ and $v_2$ are vectors in $\mathbb{R}^n$. To show that
they are orthogonal, we show that $v_1^t v_2 = 0$. Now, since $\lambda_2$ is an eigenvalue of A,
then $Av_2 = \lambda_2 v_2$, so that

$$v_1^t A v_2 = v_1^t \lambda_2 v_2 = \lambda_2\, v_1^t v_2$$

Also since $A^t = A$,

$$v_1^t A v_2 = v_1^t A^t v_2 = (A v_1)^t v_2 = \lambda_1\, v_1^t v_2$$

Equating the two expressions for $v_1^t A v_2$, we obtain

$$(\lambda_1 - \lambda_2)\, v_1^t v_2 = 0$$

Since $\lambda_1 \ne \lambda_2$, then $\lambda_1 - \lambda_2 \ne 0$. Hence, by Theorem 2, part 4, of Sec. 3.1, we
have $v_1^t v_2 = 0$, which, by the remarks preceding this theorem, gives that $v_1$ is
orthogonal to $v_2$.


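Let A be the real symmetric matrix given by

$$A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$$

Show that the eigenvectors corresponding to distinct eigenvalues of A are
orthogonal.


Solution


The characteristic equation of A is

$$\det(A - \lambda I) = -(\lambda - 1)^2 (\lambda + 1) = 0$$

so the eigenvalues are $\lambda_1 = 1$ and $\lambda_2 = -1$. Then the eigenspaces (see Sec. 5.1)
are given by

$$V_{\lambda_1} = \text{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\} \qquad \text{and} \qquad V_{\lambda_2} = \text{span}\left\{ \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix} \right\}$$

Since every vector in $V_{\lambda_1}$ is a linear combination of $u = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$ and $v = \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}$,
and $w = \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$ is orthogonal to both u and v, then by Proposition 5 of Sec. 6.4,
w is orthogonal to every eigenvector in $V_{\lambda_1}$. Hence, every eigenvector in $V_{\lambda_2}$ is
orthogonal to every eigenvector in $V_{\lambda_1}$.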


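In Example 2, we showed that every vector in the eigenspace $V_{\lambda_1}$ is orthogonal
to every vector in the eigenspace $V_{\lambda_2}$. Notice, moreover, that the basis vectors for $V_{\lambda_1}$
are orthogonal to one another. In this case, the matrix has a special factorization. We
normalize the spanning vectors of the eigenspaces to obtain

$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \qquad \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \qquad \text{and} \qquad \frac{1}{\sqrt{2}} \begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$$

Using these vectors, we construct the matrix

$$P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix}$$

which is then used to diagonalize A. That is,

$$P^{-1} A P = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \\ 0 & -\frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ 0 & \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$$

Observe in this case that the diagonalizing matrix P has the special property that
$PP^t = I$, so that $P^{-1} = P^t$. This leads to Definition 1.


DEFINITION 1


Orthogonal Matrix A square matrix P is called an orthogonal matrix if it is
invertible and $P^{-1} = P^t$.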
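The defining property of an orthogonal matrix, and the diagonalization it produces in Example 2, can be checked numerically. The sketch below uses floating-point arithmetic with a small tolerance; the helper names `transpose`, `matmul`, and `is_close` are hypothetical.

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def is_close(X, Y, tol=1e-12):
    """Entrywise comparison of two matrices up to a tolerance."""
    return all(abs(a - b) <= tol for rx, ry in zip(X, Y) for a, b in zip(rx, ry))

s = 1 / math.sqrt(2)
A = [[1, 0, 0], [0, 0, 1], [0, 1, 0]]
P = [[1, 0, 0], [0, s, -s], [0, s, s]]   # columns are the normalized eigenvectors
I3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]

Pt = transpose(P)
is_orthogonal = is_close(matmul(Pt, P), I3)   # P^t P = I means P^{-1} = P^t
D = matmul(matmul(Pt, A), P)                  # P^t A P should be diagonal

print(is_orthogonal, D)
```

Since the inverse of an orthogonal matrix is its transpose, no matrix inversion is needed to form $P^{-1}AP$.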


One important property of orthogonal matrices is that the column (and row)
vectors of an n × n orthogonal matrix are an orthonormal basis for $\mathbb{R}^n$. That is, the
vectors of this basis are all mutually orthogonal and have unit length.

As we mentioned at the beginning of this section, one particularly nice fact about
symmetric matrices is that they are diagonalizable. So by Theorem 2 of Sec. 5.2 an
n × n real symmetric matrix has n linearly independent eigenvectors. For the matrix A of
Example 2, the eigenvectors are all mutually orthogonal. Producing an orthogonal
matrix P to diagonalize A required only that we normalize the eigenvectors. In many
cases there is more to do. Specifically, by Theorem 13, eigenvectors corresponding
to distinct eigenvalues are orthogonal. However, if the geometric multiplicity of an
eigenvalue λ is greater than 1, then the vectors within $V_\lambda$ (while linearly
independent) might not be mutually orthogonal. In this case we can use the Gram-Schmidt
process, given in Sec. 6.3, to find an orthonormal basis from the linearly independent
eigenvectors. The previous discussion is summarized in Theorem 14.


THEOREM 14


Let A be an n × n real symmetric matrix. Then there is an orthogonal matrix P
and a diagonal matrix D such that $P^{-1}AP = P^tAP = D$. The eigenvalues are the
diagonal entries of D.


The following steps can be used to diagonalize an n × n real symmetric
matrix A.


1. Find the eigenvalues and corresponding eigenvectors of A.

2. Since A is diagonalizable, there are n linearly independent eigenvectors. If
necessary, use the Gram-Schmidt process to find an orthonormal set of eigenvectors.

3. Form the orthogonal matrix P with column vectors determined in Step 2.

4. The matrix $P^{-1}AP = P^tAP = D$ is a diagonal matrix.


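Let

$$A = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$$

Find an orthogonal matrix P such that $P^{-1}AP$ is a diagonal matrix.


Solution


The characteristic equation for A is given by

$$\det(A - \lambda I) = -\lambda^3 + 3\lambda + 2 = -(\lambda + 1)^2(\lambda - 2) = 0$$

Thus, the eigenvalues are $\lambda_1 = -1$ and $\lambda_2 = 2$. The corresponding eigenspaces are

$$V_{\lambda_1} = \text{span}\left\{ \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right\} \qquad \text{and} \qquad V_{\lambda_2} = \text{span}\left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} \right\}$$

Let B be the set of vectors

$$B = \{v_1, v_2, v_3\} = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix} \right\}$$

Since B is a linearly independent set of three vectors, by Theorem 2 of Sec. 5.2,
A is diagonalizable. To find an orthogonal matrix P which diagonalizes A, we use
the Gram-Schmidt process on B. This was done in Example 3 of Sec. 6.3, yielding
the orthonormal basis

$$\left\{ \frac{1}{\sqrt{3}} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix},\ \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix},\ \frac{1}{\sqrt{6}} \begin{bmatrix} -1 \\ -1 \\ 2 \end{bmatrix} \right\}$$

Let P be the matrix whose column vectors are these basis vectors, so that

$$P = \begin{bmatrix} \frac{1}{\sqrt{3}} & -\frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{6}} \\ \frac{1}{\sqrt{3}} & 0 & \frac{2}{\sqrt{6}} \end{bmatrix}$$

Observe that P is an orthogonal matrix with $P^{-1} = P^t$. Moreover,

$$P^t A P = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$$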
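The steps of this example, applying Gram-Schmidt to the eigenvectors and then forming $P^tAP$, can be sketched as follows. The code assumes the symmetric matrix A with zeros on the diagonal and ones elsewhere, which is the matrix determined by the eigenvalues and eigenspaces stated in the example; the function names `dot`, `gram_schmidt`, and `matmul` are illustrative.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def gram_schmidt(vectors):
    """Return an orthonormal list spanning the same space (input must be independent)."""
    basis = []
    for v in vectors:
        w = list(v)
        for q in basis:
            c = dot(w, q)                      # component of w along q
            w = [wi - c * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        basis.append([wi / norm for wi in w])
    return basis

A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]          # assumed matrix of the example
B = [[1, 1, 1], [-1, 1, 0], [-1, 0, 1]]        # eigenvectors for 2, -1, -1

Q = gram_schmidt(B)                            # rows of Q are the orthonormal vectors
P = [list(col) for col in zip(*Q)]             # columns of P are those vectors
D = matmul(matmul(Q, A), P)                    # Q equals P^t, so this is P^t A P

print(D)
```

Orthogonalizing the vectors in the order listed keeps the eigenvector for the eigenvalue 2 first, so the diagonal of D lists the eigenvalues in the order 2, -1, -1.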


Fact Summary


Let A be an n × n real symmetric matrix.

1. The eigenvalues of A are all real numbers.

2. The eigenvectors corresponding to distinct eigenvalues of A are orthogonal.

3. The matrix A is diagonalizable.

4. There is an orthogonal matrix P such that $D = P^{-1}AP = P^tAP$, where D
is a diagonal matrix with diagonal entries the eigenvalues of A.


Exercise Set 6.6 


In Exercises 1—4, verify that the eigenvalues of the 
symmetric matrix are all real numbers. 


-1 3 
vae[ 
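3. $A = \begin{bmatrix} 1 & 2 & 0 \\ 2 & -1 & 2 \\ 0 & 2 & -1 \end{bmatrix}$

4. $A = \begin{bmatrix} 1 & 1 & -2 \\ 1 & -1 & 2 \\ -2 & 2 & 1 \end{bmatrix}$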


In Exercises 5—8, verify that the eigenvectors of the 
symmetric matrix corresponding to distinct 
eigenvalues are orthogonal. 


In Exercises 9—12, find the eigenspaces of the n x n 
symmetric matrix, and verify that the sum of the 
dimensions of the eigenspaces is n. 


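9. $A = \begin{bmatrix} 1 & 0 & 2 \\ 0 & -1 & 0 \\ 2 & 0 & 1 \end{bmatrix}$

10. $A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & -1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$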
2 1 1 1 
dy «ae 1 
WASA 4 1 9 
i 1 =| 
1 0 00 
0o22 00 
12. A=] ò o -10 
0 0 01 


In Exercises 13—16, determine whether the matrix is 
orthogonal. 


v3 1 
= 2 2 

a a=| 2 a 
2 2 


1 -4 
a= | i ei 




15. 


16. 


v2 V2 
=~ os 8 
A=|_% & 4g 
2 2 
0 0 1 
2 2 1 
3 3 3 
— 2 2 1 
A=| -3 3533 
1 
3 0 1 


In Exercises 17-22, for the given matrix A find an
orthogonal matrix P and a diagonal matrix D such
that $D = P^tAP$.

17.
3 4
seers

18.
5 2
els

19.
-1
Aa

20.
1 2
aol]

21. $A = \begin{bmatrix} 1 & -1 & 1 \\ -1 & -1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$

22. $A = \begin{bmatrix} 1 & 0 & -1 \\ 0 & -1 & 0 \\ -1 & 0 & 1 \end{bmatrix}$

23. Show that if A and B are orthogonal
matrices, then AB and BA are orthogonal
matrices.

24. Show that if A is an orthogonal matrix, then
$\det(A) = \pm 1$.

25. Show that if A is an orthogonal matrix, then $A^t$ is
an orthogonal matrix.

26. Show that if A is an orthogonal matrix, then $A^{-1}$
is an orthogonal matrix.


27. a. Show that the matrix

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$

is orthogonal.

b. Suppose that A is a 2 × 2 orthogonal matrix.
Show that there is a real number θ such that

$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \qquad \text{or} \qquad A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$$

(Hint: Consider the equation $A^tA = I$.)

c. Suppose that A is an orthogonal 2 × 2 matrix
and $T: \mathbb{R}^2 \to \mathbb{R}^2$ is a linear operator defined
by T(v) = Av. Show that if det(A) = 1, then
T is a rotation and if det(A) = -1, then T is a
reflection about the x axis followed by a
rotation.


28. Matrices A and B are orthogonally similar if there
is an orthogonal matrix P such that $B = P^tAP$.
Suppose that A and B are orthogonally similar.

a. Show that A is symmetric if and only if B is
symmetric.

b. Show that A is orthogonal if and only if B is
orthogonal.


29. Suppose that A is an n × n matrix such that there
exists a diagonal matrix D and an orthogonal
matrix P such that $D = P^tAP$. (Matrix A is
called orthogonally diagonalizable.) Show that A
is symmetric.


30. Suppose A is invertible and orthogonally
diagonalizable. Show that $A^{-1}$ is orthogonally
diagonalizable. (See Exercise 29.)


31. Let A be an n × n skew-symmetric matrix.

a. If v is in $\mathbb{R}^n$, expand $v^tv$ in terms of the
components of the vector.

b. Show that the only possible real eigenvalue of
A is λ = 0. [Hint: Consider the quantity
$v^t(Av)$.]

6.7 Application: Quadratic Forms


When working with complicated algebraic expressions, mathematicians will often 
attempt to simplify problems by applying transformations designed to make these 
expressions easier to interpret, or at least better suited to the task at hand. In this 
section we show how certain transformations of the coordinate axes in R? can be 
used to simplify equations that describe conic sections, that is, equations in x and 
y whose graphs are parabolas, hyperbolas, circles, and ellipses. As an illustration, 
consider the equation 
\[ x^2 - 4x + y^2 - 6y - 3 = 0 \]

To simplify this equation, we complete the square on $x^2 - 4x$ and $y^2 - 6y$ to obtain

\[ (x^2 - 4x + 4) + (y^2 - 6y + 9) = 3 + 4 + 9 \]

that is,
\[ (x - 2)^2 + (y - 3)^2 = 16 \]


Figure 1


This last equation describes a circle of radius 4 centered at the point (2,3). The graph 
is shown in Fig. 1. To further simplify this equation, we can translate the coordinate 
axes by means of the equations 


\[ x' = x - 2 \quad\text{and}\quad y' = y - 3 \]
The equation of the circle then becomes
\[ (x')^2 + (y')^2 = 16 \]


This is the equation of the circle in standard position in the x'y' plane with center at 
the origin. 


Rotation of Axes 
The most general quadratic equation in two variables has the form 


\[ ax^2 + bxy + cy^2 + dx + ey + f = 0 \]




where a,b,c,d,e, and f are real numbers such that at least one of a,b, or c is 
not zero. The graph of a quadratic equation in x and y is a conic section (including 
possible degenerate cases), the particular one being dependent on the values of the 
coefficients. When $b \neq 0$, the conic section is rotated from standard position. The
expression 


\[ ax^2 + bxy + cy^2 \]
is called the associated quadratic form. For example, the quadratic equation
\[ 2x^2 + 5xy - 7y^2 + 2x - 4y + 1 = 0 \]
has an associated quadratic form given by
\[ 2x^2 + 5xy - 7y^2 \]


The quadratic equation 


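\[ ax^2 + bxy + cy^2 + dx + ey + f = 0 \]
is also given in matrix form by setting
\[ \mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix} \qquad A = \begin{bmatrix} a & \frac{b}{2} \\ \frac{b}{2} & c \end{bmatrix} \qquad \mathbf{b} = \begin{bmatrix} d \\ e \end{bmatrix} \]
Then the quadratic equation above is equivalent to
\[ \mathbf{x}^tA\mathbf{x} + \mathbf{b}^t\mathbf{x} + f = 0 \]
The quadratic form (in matrix form) is then given by
\[ \mathbf{x}^tA\mathbf{x} \]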


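As an illustration, the quadratic equation $2x^2 + 5xy + y^2 + 3x - y + 1 = 0$ in matrix
form is given by
\[ \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 2 & \frac{5}{2} \\ \frac{5}{2} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} 3 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + 1 = 0 \]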


Observe that the matrix A, for any quadratic equation in two variables, is symmetric; 
that is, $A^t = A$. This fact enables us to develop a transformation that we can use
to simplify the equation. Specifically, the map we desire will rotate the coordinate 
axes by the precise angle needed to situate the conic section in standard position with 
respect to a new coordinate system. 

To produce such a mapping, first recall from Theorem 14 of Sec. 6.6 that if A is a 
real symmetric matrix, then there exists an orthogonal matrix P and a diagonal matrix 
D such that $A = PDP^{-1} = PDP^t$. Next, we need to examine which orthogonal $2 \times 2$
matrices are rotations. 


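The associated quadratic form of the equation $2x^2 + 5xy + y^2 + 3x - y + 1 = 0$ is
\[ \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 2 & \frac{5}{2} \\ \frac{5}{2} & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} \]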



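Now by Exercise 27(b) of Sec. 6.6, a real orthogonal $2 \times 2$ matrix has the form
\[ B = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad\text{or}\quad B' = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]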


Next, recall from Sec. 4.6 that B is the matrix representation, relative to the standard
basis for $\mathbb{R}^2$, of a linear operator which rotates a vector in the plane by $\theta$ rad. The
matrix $B'$ is not a rotation (relative to any basis). To see this, let $Q = \{\mathbf{v}_1, \mathbf{v}_2\}$ be a
basis for $\mathbb{R}^2$ and $\theta = 0$. Then
\[ B' = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \]
Relative to the basis Q, this matrix produces a reflection through the line spanned by
$\mathbf{v}_1$. For example, if Q is the standard basis for $\mathbb{R}^2$, then $B'$ is a reflection through the
x axis. These results are summarized in Theorem 15.


THEOREM 15 Let B be a real orthogonal $2 \times 2$ matrix. The change of coordinates given by
\[ \begin{bmatrix} x \\ y \end{bmatrix} = B \begin{bmatrix} x' \\ y' \end{bmatrix} \]
is a rotation if and only if $\det(B) = 1$.


We are now in a position to analyze quadratic equations in two variables. Start 
with C a conic section with equation 


\[ \mathbf{x}^tA\mathbf{x} + \mathbf{b}^t\mathbf{x} + f = 0 \]


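Let P be the orthogonal matrix that diagonalizes A, so that
\[ A = PDP^t \quad\text{where}\quad D = \begin{bmatrix} \lambda_1 & 0 \\ 0 & \lambda_2 \end{bmatrix} \]
with $\lambda_1$ and $\lambda_2$ being the eigenvalues of A. As P is orthogonal, by the above remarks
on the form of P, its determinant is either $+1$ or $-1$. If $\det(P) = -1$, then interchange
the column vectors of P, along with the diagonal entries of D. Since
\[ \begin{bmatrix} \sin\theta & \cos\theta \\ -\cos\theta & \sin\theta \end{bmatrix} = \begin{bmatrix} \cos\left(\theta - \frac{\pi}{2}\right) & -\sin\left(\theta - \frac{\pi}{2}\right) \\ \sin\left(\theta - \frac{\pi}{2}\right) & \cos\left(\theta - \frac{\pi}{2}\right) \end{bmatrix} \]
a rearrangement of the column vectors of P is a rotation. To obtain the equation for C
in the $x'y'$ coordinate system, substitute $\mathbf{x} = P\mathbf{x}'$ into $\mathbf{x}^tA\mathbf{x} + \mathbf{b}^t\mathbf{x} + f = 0$ to obtain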


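\[ (P\mathbf{x}')^tA(P\mathbf{x}') + \mathbf{b}^tP\mathbf{x}' + f = 0 \]

By Theorem 6 of Sec. 1.3, if the product of A and B is defined, then $(AB)^t = B^tA^t$,
and since matrix multiplication is associative, we have
\[ (\mathbf{x}')^tP^tAP\mathbf{x}' + \mathbf{b}^tP\mathbf{x}' + f = 0 \quad\text{that is,}\quad (\mathbf{x}')^tD\mathbf{x}' + \mathbf{b}^tP\mathbf{x}' + f = 0 \]
Let $\mathbf{b}^tP = \begin{bmatrix} d' & e' \end{bmatrix}$. The last equation can now be written as
\[ \lambda_1(x')^2 + \lambda_2(y')^2 + d'x' + e'y' + f = 0 \]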




This equation gives the conic section C in standard position in the $x'y'$ coordinate
system. The type of conic section depends on the eigenvalues. Specifically, C is

1. An ellipse if $\lambda_1$ and $\lambda_2$ have the same sign

2. A hyperbola if $\lambda_1$ and $\lambda_2$ have opposite signs

3. A parabola if either $\lambda_1$ or $\lambda_2$ is zero
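
The classification above can be sketched in a few lines of Python. This is an illustrative sketch only: the function name `classify_conic` and the tolerance are assumptions, the eigenvalues of the symmetric $2 \times 2$ matrix A are taken from the quadratic formula, and degenerate cases are ignored.

```python
import math

def classify_conic(a, b, c):
    """Classify a*x^2 + b*x*y + c*y^2 + ... = 0 by the eigenvalues of
    A = [[a, b/2], [b/2, c]]. Degenerate cases are not detected."""
    # Eigenvalues of a symmetric 2x2 matrix from its trace and determinant.
    trace = a + c
    det = a * c - (b / 2) ** 2
    disc = math.sqrt(max(trace * trace - 4 * det, 0.0))
    lam1 = (trace + disc) / 2
    lam2 = (trace - disc) / 2
    tol = 1e-12  # hypothetical tolerance for "zero"
    if abs(lam1) < tol or abs(lam2) < tol:
        kind = "parabola"
    elif lam1 * lam2 > 0:
        kind = "ellipse"
    else:
        kind = "hyperbola"
    return kind, (lam1, lam2)

print(classify_conic(1, -1, 1))   # x^2 - xy + y^2 - 8 = 0
print(classify_conic(2, -4, -1))  # 2x^2 - 4xy - y^2 - 4x - 8y + 14 = 0
```

Only the coefficients a, b, and c enter the classification; the linear terms and the constant affect the position of the conic, not its type.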


EXAMPLE 1 Let C be the conic section whose equation is $x^2 - xy + y^2 - 8 = 0$.

a. Transform the equation to $x'y'$ coordinates so that C is in standard position
with no $x'y'$ term.

b. Find the angle of rotation between the standard coordinate axes and the $x'y'$
coordinate system.


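Solution a. The matrix form of this equation is given by
\[ \mathbf{x}^tA\mathbf{x} - 8 = 0 \quad\text{with}\quad A = \begin{bmatrix} 1 & -\frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix} \]
The eigenvalues of A are $\lambda_1 = \frac{1}{2}$ and $\lambda_2 = \frac{3}{2}$, with corresponding (unit) eigenvectors
\[ \mathbf{v}_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} -1 \\ 1 \end{bmatrix} \]
Then the orthogonal matrix
\[ P = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix} \]
diagonalizes A. Moreover, since $\det(P) = 1$, then by Theorem 15, the coordinate
transformation is a rotation. Making the substitution $\mathbf{x} = P\mathbf{x}'$ in the matrix
equation above gives
\[ (\mathbf{x}')^tP^tAP\mathbf{x}' - 8 = 0 \]
that is,
\[ (\mathbf{x}')^tD\mathbf{x}' - 8 = 0 \quad\text{where}\quad D = \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{3}{2} \end{bmatrix} \]
This last equation can now be written as
\[ \begin{bmatrix} x' & y' \end{bmatrix} \begin{bmatrix} \frac{1}{2} & 0 \\ 0 & \frac{3}{2} \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} - 8 = 0 \]
so that the standard form for the equation of the ellipse in the $x'y'$ coordinate
system is
\[ \frac{(x')^2}{16} + \frac{3(y')^2}{16} = 1 \]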



This is the equation of an ellipse with x’ as the major axis and y’ as the minor 
axis, as shown in Fig. 2. 


Figure 2: the ellipse $(x')^2 + 3(y')^2 = 16$
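b. To find the angle between the original axes and the $x'y'$ coordinate system,
observe that the eigenvector $\mathbf{v}_1$ points in the direction of the $x'$ axis. Now
using Definition 3 of Sec. 6.1, the cosine of the angle between $\mathbf{e}_1$ and $\mathbf{v}_1$ is
given by
\[ \cos\theta = \frac{\mathbf{e}_1 \cdot \mathbf{v}_1}{\|\mathbf{e}_1\| \, \|\mathbf{v}_1\|} = \frac{1}{\sqrt{2}} \quad\text{so that}\quad \theta = \frac{\pi}{4} \]
An alternative way to find the angle between the axes is to note that the matrix
P, which is the transition matrix from $x'y'$ coordinates to $xy$ coordinates, can
be written as
\[ P = \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} \cos\frac{\pi}{4} & -\sin\frac{\pi}{4} \\ \sin\frac{\pi}{4} & \cos\frac{\pi}{4} \end{bmatrix} \]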
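
The computations in Example 1 can be checked numerically. The sketch below is not part of the text's solution; it uses plain Python lists, and the helper names `matmul` and `transpose` are hypothetical. It forms $P^tAP$ for $A = \begin{bmatrix} 1 & -\frac{1}{2} \\ -\frac{1}{2} & 1 \end{bmatrix}$ and the rotation matrix P with columns $\frac{1}{\sqrt{2}}(1, 1)$ and $\frac{1}{\sqrt{2}}(-1, 1)$, then reads the rotation angle off the first column of P.

```python
import math

def matmul(X, Y):
    # Product of two matrices stored as lists of rows.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

s = 1 / math.sqrt(2)
A = [[1.0, -0.5], [-0.5, 1.0]]   # matrix of the form x^2 - xy + y^2
P = [[s, -s], [s, s]]            # columns are the unit eigenvectors

D = matmul(transpose(P), matmul(A, P))          # should be diag(1/2, 3/2)
det_P = P[0][0] * P[1][1] - P[0][1] * P[1][0]   # +1 for a rotation
angle = math.atan2(P[1][0], P[0][0])            # angle of the x' axis

print(D, det_P, angle)
```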


Example 2 involves a rotation and a translation. 


EXAMPLE 2 Describe the conic section C whose equation is
\[ 2x^2 - 4xy - y^2 - 4x - 8y + 14 = 0 \]


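Solution The equation for C has the form $\mathbf{x}^tA\mathbf{x} + \mathbf{b}^t\mathbf{x} + f = 0$ given by
\[ \begin{bmatrix} x & y \end{bmatrix} \begin{bmatrix} 2 & -2 \\ -2 & -1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + \begin{bmatrix} -4 & -8 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} + 14 = 0 \]
The eigenvalues of $A = \begin{bmatrix} 2 & -2 \\ -2 & -1 \end{bmatrix}$ are $\lambda_1 = -2$ and $\lambda_2 = 3$, with corresponding
(unit) eigenvectors
\[ \mathbf{v}_1 = \frac{1}{\sqrt{5}} \begin{bmatrix} 1 \\ 2 \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \frac{1}{\sqrt{5}} \begin{bmatrix} -2 \\ 1 \end{bmatrix} \]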




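Since the eigenvalues have opposite sign, the conic section C is a hyperbola. To
describe the hyperbola, we first diagonalize A. Using the unit eigenvectors, the
orthogonal matrix that diagonalizes A is
\[ P = \frac{1}{\sqrt{5}} \begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix} \]
Making the substitution $\mathbf{x} = P\mathbf{x}'$ in the equation $\mathbf{x}^tA\mathbf{x} + \mathbf{b}^t\mathbf{x} + f = 0$ gives
\[ \begin{bmatrix} x' & y' \end{bmatrix} \begin{bmatrix} -2 & 0 \\ 0 & 3 \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} + \frac{1}{\sqrt{5}} \begin{bmatrix} -4 & -8 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 2 & 1 \end{bmatrix} \begin{bmatrix} x' \\ y' \end{bmatrix} + 14 = 0 \]
After simplification of this equation we obtain
\[ -2(x')^2 - 4\sqrt{5}\,x' + 3(y')^2 + 14 = 0 \]
that is,
\[ -2\left[(x')^2 + 2\sqrt{5}\,x'\right] + 3(y')^2 + 14 = 0 \]
After completing the square on $x'$, we obtain
\[ -2\left[(x')^2 + 2\sqrt{5}\,x' + 5\right] + 3(y')^2 = -14 - 10 \]
that is,
\[ \frac{(x' + \sqrt{5})^2}{12} - \frac{(y')^2}{8} = 1 \]
This last equation describes a hyperbola with $x'$ as the major axis. An additional
transformation translating the $x'$ axis allows us to simplify the result even further.
If we let
\[ x'' = x' + \sqrt{5} \quad\text{and}\quad y'' = y' \]
then the equation now becomes
\[ \frac{(x'')^2}{12} - \frac{(y'')^2}{8} = 1 \]
The graph is shown in Fig. 3.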


Figure 3


Figure 4


Quadric Surfaces 
The graph of a quadratic equation in three variables of the form
\[ ax^2 + bxy + cxz + dy^2 + eyz + fz^2 + gx + hy + iz + j = 0 \]
is an ellipsoid, a hyperboloid, a paraboloid, or a cone. As in the two-dimensional case,
the terms $gx$, $hy$, and $iz$ produce translations from standard form, while the mixed
terms $xy$, $xz$, and $yz$ produce rotations. The quadratic form
\[ ax^2 + bxy + cxz + dy^2 + eyz + fz^2 \]
can be written in matrix form as
\[ \mathbf{x}^tA\mathbf{x} = \begin{bmatrix} x & y & z \end{bmatrix} \begin{bmatrix} a & \frac{b}{2} & \frac{c}{2} \\ \frac{b}{2} & d & \frac{e}{2} \\ \frac{c}{2} & \frac{e}{2} & f \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} \]
As before, a rotation developed from the eigenvectors of A can be used to transform
the quadric surface to one in standard form
\[ \lambda_1(x')^2 + \lambda_2(y')^2 + \lambda_3(z')^2 + j = 0 \]
where $\lambda_1$, $\lambda_2$, and $\lambda_3$ are the eigenvalues of A. We omit the details.


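EXAMPLE 3 Write the quadratic equation
\[ 5x^2 + 4y^2 - 5z^2 + 8xz = 36 \]
in standard form by eliminating the $xz$ term.

Solution Let
\[ A = \begin{bmatrix} 5 & 0 & 4 \\ 0 & 4 & 0 \\ 4 & 0 & -5 \end{bmatrix} \]
Then the quadratic equation can be written as
\[ \begin{bmatrix} x & y & z \end{bmatrix} \begin{bmatrix} 5 & 0 & 4 \\ 0 & 4 & 0 \\ 4 & 0 & -5 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = 36 \]
The eigenvalues of the matrix A are
\[ \lambda_1 = \sqrt{41} \qquad \lambda_2 = -\sqrt{41} \qquad \lambda_3 = 4 \]
Hence, the quadric surface, in standard position, has the equation
\[ \sqrt{41}\,(x')^2 - \sqrt{41}\,(y')^2 + 4(z')^2 = 36 \]
The graph of the surface, which is a hyperboloid of one sheet, is shown in Fig. 4.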
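
As a check on the eigenvalues quoted in this example (a worked step added here, not taken from the text), expand the characteristic polynomial of $A = \begin{bmatrix} 5 & 0 & 4 \\ 0 & 4 & 0 \\ 4 & 0 & -5 \end{bmatrix}$ along the second row:

\[ \det(A - \lambda I) = (4 - \lambda)\left[(5 - \lambda)(-5 - \lambda) - 16\right] = (4 - \lambda)(\lambda^2 - 41) \]

The roots are $\lambda = 4$ and $\lambda = \pm\sqrt{41}$. Two eigenvalues are positive and one is negative, which is consistent with a hyperboloid of one sheet when the right-hand side is positive.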




Exercise Set 6.7 


In Exercises 1-6, let C denote the conic section given
by the equation. Transform the equation to $x'y'$
coordinates so that C is in standard position with no
$x'y'$ term.

1. $27x^2 - 18xy + 3y^2 + x + 3y = 0$

2. $2x^2 - 8xy + 8y^2 + 2x + y = 0$

3. $12x^2 + 8xy + 12y^2 - 8 = 0$

4. $11x^2 - 6xy + 19y^2 + 2x + 4y - 12 = 0$

5. $-x^2 - 6xy - y^2 + 8 = 0$

6. $xy = 1$

7. Let C denote the conic section in standard
position given by the equation $4x^2 + 16y^2 = 16$.
a. Write the quadratic equation in matrix form.
b. Find the quadratic equation that describes the
conic C rotated by 45°.

8. Let C denote the conic section in standard
position given by the equation $x^2 - y^2 = 1$.
a. Write the quadratic equation in matrix form.
b. Find the quadratic equation that describes the
conic C rotated by −30°.

9. Let C denote the conic section in standard
position given by the equation $16x^2 + 4y^2 = 16$.
a. Find the quadratic equation for the conic
section obtained by rotating C by 60°.
b. Find the quadratic equation that describes the
conic found in part (a) after a translation
3 units to the right and 2 units upward.

10. Let C denote the conic section in standard
position given by the equation $x^2 - y = 0$.
a. Find the quadratic equation for the conic
section obtained by rotating C by 30°.
b. Find the quadratic equation that describes the
conic found in part (a) after a translation 2
units to the right and 1 unit downward.

6.8 Application: Singular Value Decomposition


In earlier sections we have examined various ways to write a given matrix as a product 
of other matrices with special properties. For example, with the LU factorization of 
Sec. 1.7, we saw that an m x n matrix A could be written as A = LU with L being an 
invertible lower triangular matrix and U an upper triangular matrix. Also in Sec. 1.7, 
we showed that if A is invertible, then it could be written as the product of elementary 
matrices. In Sec. 5.2 it was shown that an n x n matrix A with n linearly independent 
eigenvectors can be written as
\[ A = PDP^{-1} \]
where D is a diagonal matrix of eigenvalues of A. As a special case, if A is symmetric,
then A has the factorization
\[ A = QDQ^t \]
where Q is an orthogonal matrix.

In this section we consider a generalization of this last result for $m \times n$ matrices.
Specifically, we introduce the singular value decomposition, abbreviated as SVD,
which enables us to write any $m \times n$ matrix as
\[ A = U\Sigma V^t \]
where U is an $m \times m$ orthogonal matrix, V is an $n \times n$ orthogonal matrix, and $\Sigma$ is an
$m \times n$ matrix with numbers, called singular values, on its diagonal and zeros elsewhere.




Singular Values of an m x n Matrix 


To define the singular values of an $m \times n$ matrix A, we consider the matrix $A^tA$.
Observe that since A is an $m \times n$ matrix, $A^t$ is an $n \times m$ matrix, so the product $A^tA$
is a square $n \times n$ matrix. This new matrix is symmetric since $(A^tA)^t = A^t(A^t)^t = A^tA$.
Hence, by Theorem 14 of Sec. 6.6, there is an orthogonal matrix P such that
\[ P^t(A^tA)P = D \]
where D is a diagonal matrix of the eigenvalues of $A^tA$ given by
\[ D = \begin{bmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \lambda_n \end{bmatrix} \]
Since by Exercise 39 of Sec. 6.3 the matrix $A^tA$ is positive semidefinite, we also
have, by Exercise 41 of Sec. 6.3, that $\lambda_i \geq 0$ for $1 \leq i \leq n$. This permits us to make
the following definition.


DEFINITION 1 Singular Values Let A be an $m \times n$ matrix. The singular values of A, denoted
by $\sigma_i$ for $1 \leq i \leq n$, are the nonnegative square roots of the eigenvalues $\lambda_1, \ldots, \lambda_n$ of
$A^tA$. That is,
\[ \sigma_i = \sqrt{\lambda_i} \quad\text{for}\quad 1 \leq i \leq n \]

It is customary to write the singular values of A in decreasing order
\[ \sigma_1 \geq \sigma_2 \geq \cdots \geq \sigma_n \]
As mentioned in Sec. 5.2, this can be accomplished by permuting the columns of the
diagonalizing matrix P.


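EXAMPLE 1 Let A be the matrix given by
\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} \]
Find the singular values of A.

Solution The singular values of A are found by first computing the eigenvalues of the square
matrix
\[ A^tA = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix} \]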



The characteristic equation, in this case, is given by
\[ \det(A^tA - \lambda I) = (\lambda - 3)(\lambda - 1) = 0 \]
The eigenvalues of $A^tA$ are then $\lambda_1 = 3$ and $\lambda_2 = 1$, so that the singular values
are $\sigma_1 = \sqrt{3}$ and $\sigma_2 = 1$.
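
The same computation can be sketched in Python for any matrix with two columns, where $A^tA$ is $2 \times 2$ and its eigenvalues come from the quadratic formula. The function name `singular_values_two_columns` is hypothetical, and the sketch is limited to $m \times 2$ matrices; it is not a general-purpose routine.

```python
import math

def singular_values_two_columns(A):
    """Singular values of an m x 2 matrix A (a list of rows), largest first."""
    # Entries of the symmetric 2 x 2 matrix A^t A = [[p, q], [q, r]].
    p = sum(row[0] * row[0] for row in A)
    q = sum(row[0] * row[1] for row in A)
    r = sum(row[1] * row[1] for row in A)
    # Eigenvalues of A^t A from the quadratic formula.
    mean = (p + r) / 2
    radius = math.sqrt(((p - r) / 2) ** 2 + q * q)
    lam1 = mean + radius
    lam2 = max(mean - radius, 0.0)  # guard against tiny negative round-off
    return math.sqrt(lam1), math.sqrt(lam2)

A = [[1, 1], [0, 1], [1, 0]]
print(singular_values_two_columns(A))  # approximately (sqrt(3), 1)
```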


We have already seen that orthogonal bases are desirable and that the Gram-Schmidt
process can be used to construct an orthogonal basis from any basis of an inner
product space. In Sec. 6.4 we introduced the Four Fundamental Subspaces of an $m \times n$
matrix A and noted that the linear system $A\mathbf{x} = \mathbf{b}$ can be analyzed from the geometric
relationships between these subspaces. In this section we develop a method to construct
orthonormal bases for the Four Fundamental Subspaces of A. To do so, we use the
symmetric matrix $A^tA$, which by Theorem 14 of Sec. 6.6 is diagonalizable. Recall from
Theorem 2, Sec. 5.2, that the eigenvectors of $A^tA$ form a basis for $\mathbb{R}^n$. In Theorem 17 we
show that the images under A of the eigenvectors of $A^tA$ corresponding to nonzero
eigenvalues form a basis for the column space of A. We first require a preliminary result
given in Theorem 16.


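THEOREM 16 Let A be an $m \times n$ matrix and let $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be an orthonormal basis
of $\mathbb{R}^n$ consisting of eigenvectors of $A^tA$, with corresponding eigenvalues $\lambda_1,
\lambda_2, \ldots, \lambda_n$. Then

1. $\|A\mathbf{v}_i\| = \sigma_i$ for each $i = 1, 2, \ldots, n$.
2. $A\mathbf{v}_i$ is orthogonal to $A\mathbf{v}_j$ for $i \neq j$.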


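Proof For the first statement recall from Sec. 6.6 that the length of a vector $\mathbf{v}$ in
Euclidean space can be given by the matrix product $\|\mathbf{v}\| = \sqrt{\mathbf{v}^t\mathbf{v}}$. Therefore,
\[ \|A\mathbf{v}_i\|^2 = (A\mathbf{v}_i)^t(A\mathbf{v}_i) = \mathbf{v}_i^t(A^tA)\mathbf{v}_i = \mathbf{v}_i^t\lambda_i\mathbf{v}_i = \lambda_i\|\mathbf{v}_i\|^2 = \lambda_i \]
The last equality is due to the fact that $\mathbf{v}_i$ is a unit vector. Part 1 is established
by noting that $\sigma_i = \sqrt{\lambda_i} = \|A\mathbf{v}_i\|$. For part 2 of the theorem, we know that (as in
Sec. 6.6) the dot product of two vectors $\mathbf{u}$ and $\mathbf{v}$ in Euclidean space can be given
by the matrix product $\mathbf{u} \cdot \mathbf{v} = \mathbf{u}^t\mathbf{v}$. Thus, since B is an orthonormal basis of $\mathbb{R}^n$, if
$i \neq j$, then
\[ (A\mathbf{v}_i) \cdot (A\mathbf{v}_j) = (A\mathbf{v}_i)^t(A\mathbf{v}_j) = \mathbf{v}_i^t(A^tA)\mathbf{v}_j = \mathbf{v}_i^t\lambda_j\mathbf{v}_j = \lambda_j\mathbf{v}_i^t\mathbf{v}_j = 0 \]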


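THEOREM 17 Let A be an $m \times n$ matrix and $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ an orthonormal basis of $\mathbb{R}^n$
consisting of eigenvectors of $A^tA$. Suppose that the corresponding eigenvalues
satisfy $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_r > \lambda_{r+1} = \cdots = \lambda_n = 0$, that is, $A^tA$ has r nonzero
eigenvalues. Then $B' = \{A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_r\}$ is an orthogonal basis for the column
space of A and $\operatorname{rank}(A) = r$.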




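Proof First observe that since $\sigma_i = \sqrt{\lambda_i}$ are all nonzero for $1 \leq i \leq r$, then by
Theorem 16, part 1, we have $A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_r$ are all nonzero vectors in col(A).
By part 2 of Theorem 16, we have $B' = \{A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_r\}$ is an orthogonal set
of vectors in $\mathbb{R}^m$. Hence, by Theorem 5 of Sec. 6.2, $B'$ is linearly independent. Now
to show that these vectors span the column space of A, let $\mathbf{w}$ be a vector in col(A).
Thus, there exists a vector $\mathbf{v}$ in $\mathbb{R}^n$ such that $A\mathbf{v} = \mathbf{w}$. Since $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$
is a basis for $\mathbb{R}^n$, there are scalars $c_1, c_2, \ldots, c_n$ such that
\[ \mathbf{v} = c_1\mathbf{v}_1 + c_2\mathbf{v}_2 + \cdots + c_n\mathbf{v}_n \]
Multiplying both sides of the last equation by A, we obtain
\[ A\mathbf{v} = c_1A\mathbf{v}_1 + c_2A\mathbf{v}_2 + \cdots + c_nA\mathbf{v}_n \]
Now, using the fact that $A\mathbf{v}_{r+1} = A\mathbf{v}_{r+2} = \cdots = A\mathbf{v}_n = \mathbf{0}$, then
\[ A\mathbf{v} = c_1A\mathbf{v}_1 + c_2A\mathbf{v}_2 + \cdots + c_rA\mathbf{v}_r \]
so that $\mathbf{w} = A\mathbf{v}$ is in $\operatorname{span}\{A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_r\}$. Consequently, $B' =
\{A\mathbf{v}_1, A\mathbf{v}_2, \ldots, A\mathbf{v}_r\}$ is an orthogonal basis for the column space of A, and
the rank of A is equal to the number of its nonzero singular values.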


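EXAMPLE 2 Let A be the matrix given by
\[ A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix} \]
Find the image of the unit circle under the linear transformation $T: \mathbb{R}^2 \to \mathbb{R}^3$
defined by $T(\mathbf{v}) = A\mathbf{v}$.

Solution From Example 1, the eigenvalues of $A^tA$ are $\lambda_1 = 3$ and $\lambda_2 = 1$, with eigenvectors
\[ \mathbf{v}_1 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \]
respectively. The singular values of A are then $\sigma_1 = \sqrt{3}$ and $\sigma_2 = 1$. Let $C(t)$ be
the unit circle given by $\cos(t)\mathbf{v}_1 + \sin(t)\mathbf{v}_2$ for $0 \leq t \leq 2\pi$. The image of $C(t)$
under T is given by
\[ T(C(t)) = \cos(t)A\mathbf{v}_1 + \sin(t)A\mathbf{v}_2 \]
By Theorem 17, $B' = \left\{\frac{1}{\sigma_1}A\mathbf{v}_1, \frac{1}{\sigma_2}A\mathbf{v}_2\right\}$ is a basis for the range of T. Hence, the
coordinates of $T(C(t))$ relative to $B'$ are $x' = \sigma_1\cos t = \sqrt{3}\cos t$ and
$y' = \sigma_2\sin t = \sin t$. Observe that
\[ \left(\frac{x'}{\sqrt{3}}\right)^2 + (y')^2 = \frac{(x')^2}{3} + (y')^2 = \cos^2 t + \sin^2 t = 1 \]
which is an ellipse with the length of the semimajor axis equal to $\sigma_1$ and length of
the semiminor axis equal to $\sigma_2$, as shown in Fig. 1.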
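
The claim that the image of the unit circle is an ellipse with semiaxes $\sigma_1$ and $\sigma_2$ can be spot-checked numerically. The following sketch is an illustration added here, with hypothetical helper names `apply`, `dot`, and `image_coordinates`. It maps eight points of the circle through $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \\ 1 & 0 \end{bmatrix}$, takes coordinates relative to $\mathbf{u}_1 = \frac{1}{\sigma_1}A\mathbf{v}_1$ and $\mathbf{u}_2 = \frac{1}{\sigma_2}A\mathbf{v}_2$, and measures how far each point is from satisfying $(x')^2/3 + (y')^2 = 1$.

```python
import math

A = [[1, 1], [0, 1], [1, 0]]
s = 1 / math.sqrt(2)
v1, v2 = [s, s], [-s, s]             # unit eigenvectors of A^t A
sigma1, sigma2 = math.sqrt(3), 1.0   # singular values of A

def apply(M, v):
    # Matrix-vector product M v.
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

# Orthonormal basis for the range of T.
u1 = [x / sigma1 for x in apply(A, v1)]
u2 = [x / sigma2 for x in apply(A, v2)]

def image_coordinates(t):
    # Point C(t) on the unit circle, its image, and the image's coordinates.
    c = [math.cos(t) * a + math.sin(t) * b for a, b in zip(v1, v2)]
    w = apply(A, c)
    return dot(w, u1), dot(w, u2)

residuals = []
for k in range(8):
    x, y = image_coordinates(2 * math.pi * k / 8)
    residuals.append(abs(x * x / 3 + y * y - 1))
max_residual = max(residuals)
print(max_residual)  # close to zero, up to round-off
```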


Figure 1


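For certain matrices, some of the singular values may be zero. As an illustration,
consider the matrix
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix} \]
For this matrix, we have $\operatorname{col}(A) = \operatorname{span}\left\{\begin{bmatrix} 1 \\ 3 \end{bmatrix}\right\}$.
The reduced row echelon form for A is the matrix $\begin{bmatrix} 1 & 2 \\ 0 & 0 \end{bmatrix}$, which has only one
pivot column. Hence, the rank of A is equal to 1. The eigenvalues of $A^tA$ are $\lambda_1 = 50$
and $\lambda_2 = 0$ with corresponding unit eigenvectors
\[ \mathbf{v}_1 = \begin{bmatrix} 1/\sqrt{5} \\ 2/\sqrt{5} \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} -2/\sqrt{5} \\ 1/\sqrt{5} \end{bmatrix} \]
The singular values of A are given by $\sigma_1 = 5\sqrt{2}$ and $\sigma_2 = 0$. Now, multiplying $\mathbf{v}_1$
and $\mathbf{v}_2$ by A gives
\[ A\mathbf{v}_1 = \begin{bmatrix} \sqrt{5} \\ 3\sqrt{5} \end{bmatrix} \quad\text{and}\quad A\mathbf{v}_2 = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \]
Observe that $A\mathbf{v}_1$ spans the one dimensional column space of A. In this case, the
linear transformation $T: \mathbb{R}^2 \to \mathbb{R}^2$ defined by $T(\mathbf{x}) = A\mathbf{x}$ maps the unit circle to the
line segment
\[ \left\{ t \begin{bmatrix} \sqrt{5} \\ 3\sqrt{5} \end{bmatrix} \;\middle|\; -1 \leq t \leq 1 \right\} \]
as shown in Fig. 2.

Figure 2

Singular Value Decomposition (SVD)

We now turn our attention to the problem of finding a singular value decomposition
of an $m \times n$ matrix A.

THEOREM 18 SVD Let A be an $m \times n$ matrix of rank r, with r nonzero singular values
$\sigma_1, \sigma_2, \ldots, \sigma_r$. Then there exists an $m \times n$ matrix $\Sigma$, an $m \times m$ orthogonal matrix
U, and an $n \times n$ orthogonal matrix V such that
\[ A = U\Sigma V^t \]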


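Proof Since $A^tA$ is an $n \times n$ symmetric matrix, by Theorem 14 of Sec. 6.6 there
is an orthonormal basis $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ of $\mathbb{R}^n$, consisting of eigenvectors of $A^tA$. Now
by Theorem 17, $\{A\mathbf{v}_1, \ldots, A\mathbf{v}_r\}$ is an orthogonal basis for col(A). Let $\{\mathbf{u}_1, \ldots, \mathbf{u}_r\}$
be the orthonormal basis for col(A), given by
\[ \mathbf{u}_i = \frac{1}{\|A\mathbf{v}_i\|}A\mathbf{v}_i = \frac{1}{\sigma_i}A\mathbf{v}_i \quad\text{for}\quad i = 1, \ldots, r \]
Next, extend $\{\mathbf{u}_1, \ldots, \mathbf{u}_r\}$ to the orthonormal basis $\{\mathbf{u}_1, \ldots, \mathbf{u}_m\}$ of $\mathbb{R}^m$. We can
now define the orthogonal matrices V and U, using the vectors $\{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ and
$\{\mathbf{u}_1, \ldots, \mathbf{u}_m\}$, respectively, as column vectors, so that
\[ V = \begin{bmatrix} \mathbf{v}_1 & \mathbf{v}_2 & \cdots & \mathbf{v}_n \end{bmatrix} \quad\text{and}\quad U = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_m \end{bmatrix} \]
Moreover, since $A\mathbf{v}_i = \sigma_i\mathbf{u}_i$, for $i = 1, \ldots, r$, then
\[ AV = \begin{bmatrix} A\mathbf{v}_1 & \cdots & A\mathbf{v}_r & \mathbf{0} & \cdots & \mathbf{0} \end{bmatrix} = \begin{bmatrix} \sigma_1\mathbf{u}_1 & \cdots & \sigma_r\mathbf{u}_r & \mathbf{0} & \cdots & \mathbf{0} \end{bmatrix} \]
Now let $\Sigma$ be the $m \times n$ matrix given by
\[ \Sigma = \begin{bmatrix} \sigma_1 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & \sigma_r & 0 & \cdots & 0 \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \\ \vdots & \vdots & & \vdots & \vdots & & \vdots \\ 0 & 0 & \cdots & 0 & 0 & \cdots & 0 \end{bmatrix} \]
Then
\[ U\Sigma = \begin{bmatrix} \mathbf{u}_1 & \mathbf{u}_2 & \cdots & \mathbf{u}_m \end{bmatrix} \Sigma = \begin{bmatrix} \sigma_1\mathbf{u}_1 & \cdots & \sigma_r\mathbf{u}_r & \mathbf{0} & \cdots & \mathbf{0} \end{bmatrix} = AV \]
Since V is orthogonal, then $V^t = V^{-1}$, and hence, $A = U\Sigma V^t$.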


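EXAMPLE 3 Find a singular value decomposition of the matrix
\[ A = \begin{bmatrix} -1 & 1 \\ -1 & 1 \\ 2 & -2 \end{bmatrix} \]

Solution A procedure for finding an SVD of A is included in the proof of Theorem 18. We
present the solution as a sequence of steps.

Step 1. Find the eigenvalues and corresponding orthonormal eigenvectors of $A^tA$
and define the matrix V.
The eigenvalues of the matrix
\[ A^tA = \begin{bmatrix} 6 & -6 \\ -6 & 6 \end{bmatrix} \]
in decreasing order are given by $\lambda_1 = 12$ and $\lambda_2 = 0$. The corresponding orthonormal
eigenvectors are
\[ \mathbf{v}_1 = \begin{bmatrix} -1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \quad\text{and}\quad \mathbf{v}_2 = \begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix} \]
Since the column vectors of V are given by the orthonormal eigenvectors of $A^tA$,
the matrix V is given by
\[ V = \begin{bmatrix} -1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} \]

Step 2. Find the singular values of A and define the matrix $\Sigma$.
The singular values of A are the square roots of the eigenvalues of $A^tA$, so that
\[ \sigma_1 = \sqrt{\lambda_1} = 2\sqrt{3} \quad\text{and}\quad \sigma_2 = \sqrt{\lambda_2} = 0 \]
Since $\Sigma$ has the same dimensions as A, then $\Sigma$ is $3 \times 2$. In this case,
\[ \Sigma = \begin{bmatrix} 2\sqrt{3} & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \]

Step 3. Define the matrix U.
The matrix A has one nonzero singular value, so by Theorem 17 the rank of A is 1.
Therefore, the first column of U is
\[ \mathbf{u}_1 = \frac{1}{\sigma_1}A\mathbf{v}_1 = \begin{bmatrix} 1/\sqrt{6} \\ 1/\sqrt{6} \\ -2/\sqrt{6} \end{bmatrix} \]
Next we extend the set $\{\mathbf{u}_1\}$ to an orthonormal basis for $\mathbb{R}^3$ by adding to it the
vectors
\[ \mathbf{u}_2 = \begin{bmatrix} 2/\sqrt{5} \\ 0 \\ 1/\sqrt{5} \end{bmatrix} \quad\text{and}\quad \mathbf{u}_3 = \begin{bmatrix} -1/\sqrt{30} \\ 5/\sqrt{30} \\ 2/\sqrt{30} \end{bmatrix} \]
so that
\[ U = \begin{bmatrix} 1/\sqrt{6} & 2/\sqrt{5} & -1/\sqrt{30} \\ 1/\sqrt{6} & 0 & 5/\sqrt{30} \\ -2/\sqrt{6} & 1/\sqrt{5} & 2/\sqrt{30} \end{bmatrix} \]
The singular value decomposition of A is then given by
\[ A = U\Sigma V^t = \begin{bmatrix} 1/\sqrt{6} & 2/\sqrt{5} & -1/\sqrt{30} \\ 1/\sqrt{6} & 0 & 5/\sqrt{30} \\ -2/\sqrt{6} & 1/\sqrt{5} & 2/\sqrt{30} \end{bmatrix} \begin{bmatrix} 2\sqrt{3} & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} -1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & 1/\sqrt{2} \end{bmatrix} = \begin{bmatrix} -1 & 1 \\ -1 & 1 \\ 2 & -2 \end{bmatrix} \]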
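
A factorization of this kind is easy to verify by multiplying it out. The sketch below is an added illustration, not the text's method: the factors are typed in under the assumption that $A = \begin{bmatrix} -1 & 1 \\ -1 & 1 \\ 2 & -2 \end{bmatrix}$ and that V has columns $\frac{1}{\sqrt{2}}(-1, 1)$ and $\frac{1}{\sqrt{2}}(1, 1)$, and the helper names `matmul` and `transpose` are hypothetical. It checks that $U\Sigma V^t$ reproduces A and that the columns of U are orthonormal.

```python
import math

def matmul(X, Y):
    # Product of two matrices stored as lists of rows.
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

r2, r5, r6, r30 = (math.sqrt(n) for n in (2, 5, 6, 30))

U = [[1 / r6, 2 / r5, -1 / r30],
     [1 / r6, 0.0, 5 / r30],
     [-2 / r6, 1 / r5, 2 / r30]]
Sigma = [[2 * math.sqrt(3), 0.0],
         [0.0, 0.0],
         [0.0, 0.0]]
V = [[-1 / r2, 1 / r2],
     [1 / r2, 1 / r2]]

product = matmul(U, matmul(Sigma, transpose(V)))  # should reproduce A
UtU = matmul(transpose(U), U)                     # should be the identity

print(product)
```

Because $\sigma_2 = 0$, only the first columns of U and V contribute to the product; the other columns are needed only to make U and V orthogonal.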


In Example 3, the process of finding a singular value decomposition of A was
complicated by the task of extending the set $\{\mathbf{u}_1, \ldots, \mathbf{u}_r\}$ to an orthonormal basis for
$\mathbb{R}^m$. Alternatively, we can use $A^tA$ to find V and $AA^t$ to find U. To see this, note
that if $A = U\Sigma V^t$ is an SVD of A, then $A^t = V\Sigma^tU^t$. After multiplying A on the
left by its transpose, we obtain
\[ A^tA = V\Sigma^tU^tU\Sigma V^t = VD_1V^t \]
where $D_1$ is an $n \times n$ diagonal matrix with diagonal entries the eigenvalues of $A^tA$.
Hence, V is an orthogonal matrix that diagonalizes $A^tA$. On the other hand,
\[ AA^t = U\Sigma V^tV\Sigma^tU^t = UD_2U^t \]
where $D_2$ is an $m \times m$ diagonal matrix with diagonal entries the eigenvalues of $AA^t$
and U is an orthogonal matrix that diagonalizes $AA^t$. Note that it can be shown
that the matrices $A^tA$ and $AA^t$ have the same nonzero eigenvalues. Therefore, the nonzero
diagonal entries of $D_1$ and $D_2$ are the same. The matrices U and V found using this
procedure are not unique. We also note that changing the signs of the column vectors
in U and V also produces orthogonal matrices that diagonalize $AA^t$ and $A^tA$. As a
result, finding an SVD of A may require changing the signs of certain columns of U
or V.

In Example 4 we use this idea to find an SVD for a matrix. 


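EXAMPLE 4 Find a singular value decomposition of the matrix
\[ A = \begin{bmatrix} 1 & 1 \\ 3 & -3 \end{bmatrix} \]

Solution First observe that
\[ A^tA = \begin{bmatrix} 1 & 3 \\ 1 & -3 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ 3 & -3 \end{bmatrix} = \begin{bmatrix} 10 & -8 \\ -8 & 10 \end{bmatrix} \]
By inspection we see that $\mathbf{v}_1 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ -1 \end{bmatrix}$ is a unit eigenvector of $A^tA$ with
corresponding eigenvalue $\lambda_1 = 18$, and $\mathbf{v}_2 = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ is a unit eigenvector of
$A^tA$ with corresponding eigenvalue $\lambda_2 = 2$. Hence,
\[ V = \frac{1}{\sqrt{2}} \begin{bmatrix} 1 & 1 \\ -1 & 1 \end{bmatrix} \]
The singular values of A are $\sigma_1 = 3\sqrt{2}$ and $\sigma_2 = \sqrt{2}$ so that
\[ \Sigma = \begin{bmatrix} 3\sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \]
To find U, we compute
\[ AA^t = \begin{bmatrix} 1 & 1 \\ 3 & -3 \end{bmatrix} \begin{bmatrix} 1 & 3 \\ 1 & -3 \end{bmatrix} = \begin{bmatrix} 2 & 0 \\ 0 & 18 \end{bmatrix} \]
Observe that a unit eigenvector corresponding to $\lambda_1 = 18$ is $\mathbf{u}_1 = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$ and a unit
eigenvector corresponding to $\lambda_2 = 2$ is $\mathbf{u}_2 = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$. Thus,
\[ U = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]
A singular value decomposition of A is then given by
\[ A = U\Sigma V^t = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 3\sqrt{2} & 0 \\ 0 & \sqrt{2} \end{bmatrix} \begin{bmatrix} \frac{1}{\sqrt{2}} & -\frac{1}{\sqrt{2}} \\ \frac{1}{\sqrt{2}} & \frac{1}{\sqrt{2}} \end{bmatrix} = \begin{bmatrix} 1 & 1 \\ 3 & -3 \end{bmatrix} \]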

Figure 3


The Four Fundamental Subspaces 


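In this subsection we show how the matrices U and V, which give the singular value
decomposition of A, provide orthonormal bases for the four fundamental subspaces
of A, introduced in Sec. 6.4. To develop this idea, let A be an $m \times n$ matrix of rank
$r \leq n$ and $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ be an orthonormal basis of eigenvectors of $A^tA$ with
corresponding eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_r > \lambda_{r+1} = \cdots = \lambda_n = 0$. First, from
the proof of Theorem 17 if $\sigma_1, \ldots, \sigma_r$ are the nonzero singular values of A, then
\[ C' = \left\{\frac{1}{\sigma_1}A\mathbf{v}_1, \ldots, \frac{1}{\sigma_r}A\mathbf{v}_r\right\} = \{\mathbf{u}_1, \ldots, \mathbf{u}_r\} \]
is a basis for col(A). Next, the remaining columns of U are defined by extending
$C'$ to an orthonormal basis $C = \{\mathbf{u}_1, \ldots, \mathbf{u}_r, \mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\}$ for $\mathbb{R}^m$. We claim that
$C'' = \{\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\}$ is an orthonormal basis for $N(A^t)$. To see this, observe that each
vector of $C'$ is orthogonal to each vector of $C''$. Hence, by Proposition 5 of Sec. 6.4
and the fact that $\dim(\mathbb{R}^m) = m$, we have $\operatorname{span}\{\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\} = \operatorname{col}(A)^\perp$. By Theorem
10, part 2, of Sec. 6.4, $\operatorname{span}\{\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\} = N(A^t)$, so that $C'' = \{\mathbf{u}_{r+1}, \ldots, \mathbf{u}_m\}$ is a
basis for $N(A^t)$ as claimed. We now turn our attention to the matrix V. From the proof
of Theorem 16, we have $A\mathbf{v}_{r+1} = \cdots = A\mathbf{v}_n = \mathbf{0}$. Consequently, $\operatorname{span}\{\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$
is contained in $N(A)$. Now by Theorem 5 of Sec. 4.2,
\[ \dim(N(A)) + \dim(\operatorname{col}(A)) = n \]
so that $\dim(N(A)) = n - r$. Since $B'' = \{\mathbf{v}_{r+1}, \ldots, \mathbf{v}_n\}$ is an orthogonal, and hence
linearly independent, set of $n - r$ vectors in $N(A)$, by Theorem 12, part (1), of
Sec. 3.3, $B''$ is a basis for $N(A)$. Finally, since $B = \{\mathbf{v}_1, \ldots, \mathbf{v}_n\}$ is an orthonormal
basis for $\mathbb{R}^n$, each vector of $B''$ is orthogonal to every vector in $B' = \{\mathbf{v}_1, \ldots, \mathbf{v}_r\}$.
Hence,
\[ \operatorname{span}\{\mathbf{v}_1, \ldots, \mathbf{v}_r\} = N(A)^\perp = \operatorname{col}(A^t) = \operatorname{row}(A) \]
so that $B'$ is a basis for row(A).

To illustrate the ideas of this discussion, consider the matrix A of Example 3 and
its SVD. By the above discussion, we have
\[ \operatorname{row}(A) = \operatorname{span}\left\{\begin{bmatrix} -1 \\ 1 \end{bmatrix}\right\} \qquad \operatorname{col}(A) = \operatorname{span}\left\{\begin{bmatrix} 1 \\ 1 \\ -2 \end{bmatrix}\right\} \]
\[ N(A) = \operatorname{span}\left\{\begin{bmatrix} 1 \\ 1 \end{bmatrix}\right\} \qquad N(A^t) = \operatorname{span}\left\{\begin{bmatrix} 2 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix}\right\} \]
The four fundamental subspaces are shown in Fig. 3.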


Data Compression 


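An important application that involves the singular value decomposition is data
compression. As a preliminary step, suppose that a matrix A of rank r (with r nonzero
singular values) has the SVD $A = U\Sigma V^t$. That is,
\[ \begin{aligned} A = U\Sigma V^t &= \begin{bmatrix} \sigma_1\mathbf{u}_1 & \cdots & \sigma_r\mathbf{u}_r & \mathbf{0} & \cdots & \mathbf{0} \end{bmatrix} V^t \\ &= \sigma_1\mathbf{u}_1\mathbf{v}_1^t + \sigma_2\mathbf{u}_2\mathbf{v}_2^t + \cdots + \sigma_r\mathbf{u}_r\mathbf{v}_r^t \\ &= \sigma_1\left(\frac{1}{\sigma_1}A\mathbf{v}_1\right)\mathbf{v}_1^t + \sigma_2\left(\frac{1}{\sigma_2}A\mathbf{v}_2\right)\mathbf{v}_2^t + \cdots + \sigma_r\left(\frac{1}{\sigma_r}A\mathbf{v}_r\right)\mathbf{v}_r^t \\ &= (A\mathbf{v}_1)\mathbf{v}_1^t + (A\mathbf{v}_2)\mathbf{v}_2^t + \cdots + (A\mathbf{v}_r)\mathbf{v}_r^t \end{aligned} \]
Observe that each of the terms $A\mathbf{v}_i\mathbf{v}_i^t$ is a matrix of rank 1. Consequently, the sum
of the first k terms of the last equation is a matrix of rank $k \leq r$, which gives an
approximation to the matrix A. This factorization of a matrix has application in many
areas.

Figure 4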

As an illustration of the utility of such an approximation, suppose that A is the 
356 x 500 matrix, where each entry is a numeric value for a pixel, of the gray scale 
image of the surface of Mars shown in Fig. 4. A simple algorithm using the method 
above for approximating the image stored in the matrix A is given by the following: 


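1. Find the eigenvectors of the $n \times n$ symmetric matrix $A^tA$.
2. Compute $A\mathbf{v}_i$, for $i = 1, \ldots, k$, with $k \leq r = \operatorname{rank}(A)$.
3. The matrix $(A\mathbf{v}_1)\mathbf{v}_1^t + (A\mathbf{v}_2)\mathbf{v}_2^t + \cdots + (A\mathbf{v}_k)\mathbf{v}_k^t$ is an approximation of the
original image.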


To transmit the kth approximation of the image and reproduce it back on earth
requires the eigenvectors $\mathbf{v}_1, \ldots, \mathbf{v}_k$ of $A^tA$ and the vectors $A\mathbf{v}_1, \ldots, A\mathbf{v}_k$.
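
The three steps above can be sketched in Python on a small matrix. This is a minimal illustration under stated assumptions: the orthonormal eigenvectors of $A^tA$ are supplied by hand (here for the $3 \times 2$ matrix with rows (1, 1), (0, 1), (1, 0), ordered by decreasing eigenvalue) rather than computed, and the names `apply`, `outer`, and `rank_k_approximation` are hypothetical.

```python
import math

def apply(M, v):
    # Matrix-vector product M v.
    return [sum(a * x for a, x in zip(row, v)) for row in M]

def outer(u, v):
    # Rank 1 matrix u v^t.
    return [[a * b for b in v] for a in u]

def rank_k_approximation(A, eigenvectors, k):
    """Sum of the first k terms (A v_i) v_i^t, where `eigenvectors` are
    orthonormal eigenvectors of A^t A in order of decreasing eigenvalue."""
    m, n = len(A), len(A[0])
    approx = [[0.0] * n for _ in range(m)]
    for v in eigenvectors[:k]:
        term = outer(apply(A, v), v)
        for i in range(m):
            for j in range(n):
                approx[i][j] += term[i][j]
    return approx

A = [[1, 1], [0, 1], [1, 0]]
s = 1 / math.sqrt(2)
eigenvectors = [[s, s], [-s, s]]   # eigenvalues 3 and 1 of A^t A

rank1 = rank_k_approximation(A, eigenvectors, 1)
rank2 = rank_k_approximation(A, eigenvectors, 2)
print(rank1)
print(rank2)  # reproduces A up to round-off, since rank(A) = 2
```

For an image-sized matrix the eigenvectors would come from a numerical eigenvalue routine; only the k vectors $\mathbf{v}_i$ and the k vectors $A\mathbf{v}_i$ need to be stored.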

The images in Fig. 5 are produced using matrices of ranks 1, 4, 10, 40, 80, and 
100, respectively. 


Figure 5 


The storage requirements for each of the images are given in Table 1. 


Table 1

Image | Storage Requirement | Percent of Original
Original | 356 x 500 = 178,000 | 100
Approximation 1 | 2 x 500 = 1,000 | 0.6
Approximation 2 | 8 x 500 = 4,000 | 2.2
Approximation 3 | 20 x 500 = 10,000 | 5.6
Approximation 4 | 80 x 500 = 40,000 | 22.5
Approximation 5 | 160 x 500 = 80,000 | 44.9
Approximation 6 | 200 x 500 = 100,000 | 56.2
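
The storage counts in Table 1 follow a simple pattern: the rank k approximation is charged 2k vectors of length 500. The short sketch below reproduces that count and the corresponding percentage of the original 356 x 500 image. The function name `storage_percent` is hypothetical, and the count of 2k x 500 is taken from the table itself rather than derived independently.

```python
ROWS, COLS = 356, 500   # dimensions of the original image matrix

def storage_percent(k):
    """Storage charged to the rank k approximation, following the table's
    count of 2k vectors of length COLS, and its share of the original."""
    stored = 2 * k * COLS
    return stored, 100 * stored / (ROWS * COLS)

for k in (1, 4, 10, 40, 80, 100):
    stored, percent = storage_percent(k)
    print(k, stored, round(percent, 1))
```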


Exercise Set 6.8 


In Exercises 1—4, find the singular values for the =2 i =] 
matrix. 8. A= 0 1 1 


| =2 -2 | In Exercises 9 and 10, the condition number of a 
1 1 matrix A is the ratio o,/o,, of the largest to the 
smallest singular value. The condition number 
2 A= | Sle 2 | provides a measure of the sensitivity of the linear 
2 system Ax = b to perturbations to A or b. A linear 
| system is ill-conditioned when the condition number 


1 0 2 is too large and called singular when the condition 
3. A= 2 -1 -1 number is infinite (the matrix is not invertible). 
—2 1 
9. Let A= od 
1 1 0 eet “= 1 1,000000001 | 
4. A= 0 oO 1 a. Solve the linear system 
-l 1 0 2 
ml 
In Exercises 5—8, find a singular value decomposition 2 


for the matrix. i 
b. Solve the linear system 


5 3 2 
A= = 
i | 3 5 | = | 2.000000001 | 
2 
1 


| c. Find the condition number for A. 
0 
1 


1 


1 0 10. Let b= 3 
akora] ~ 




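a. Let
\[ A = \begin{bmatrix} -2 & -1 & 0 \\ -2 & -1 & -2 \\ 0 & -2 & 1 \end{bmatrix} \]
Solve the linear system $A\mathbf{x} = \mathbf{b}$.

b. Let
\[ B = \begin{bmatrix} -2.00001 & -1.001 & 0 \\ -2.01 & -0.87 & -2 \\ 0 & -2 & 1 \end{bmatrix} \]
Solve the linear system $B\mathbf{x} = \mathbf{b}$.

c. Find the condition number for A.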

Review Exercises for Chapter 6 


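1. Let V be the inner product space $\mathbb{R}^3$ with the
standard inner product and let
\[ B = \left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} \right\} \]

a. Verify that B is a basis for $\mathbb{R}^3$.
b. Use B to find an orthonormal basis for $\mathbb{R}^3$.
c. Let $W = \operatorname{span}\left\{ \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \right\}$ and
$\mathbf{v} = \begin{bmatrix} -2 \\ 1 \\ -1 \end{bmatrix}$. Find $\operatorname{proj}_W \mathbf{v}$. (Hint: First use
the Gram-Schmidt process to find an
orthogonal basis for W; then refer to Exercise
16 of Sec. 6.4.)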


2. Let 
—1 —3 
2 0 
W = span 2 l> ol 
—2 0 
3 0 
—2 0 
-1 [°| -1 
1 


be a subspace of R* with the standard inner 
product. 


a. Find a basis for W. 
b. Find WŁ. 
c. Find an orthonormal basis for W. 


d. Find an orthonormal basis for W+. 
e. Verify that dim(R*) = dim(W) + dim(W +). 
—2 


f. Find the orthogonal projection of v = 


onto W. 


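3. Let a, b, and c be real numbers and
\[ W = \left\{ \begin{bmatrix} x \\ y \\ z \end{bmatrix} \in \mathbb{R}^3 \;\middle|\; ax + by + cz = 0 \right\} \]
where $\mathbb{R}^3$ is given the standard inner product.

a. Show that $\begin{bmatrix} a \\ b \\ c \end{bmatrix}$ is in $W^\perp$.
b. Describe $W^\perp$.
c. Let $\mathbf{v} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$. Find $\operatorname{proj}_{W^\perp} \mathbf{v}$.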


d. Find || projy v || - 


. Define on Pz an inner product by 


1 
(p.q) = f p(x)q(x)dx 


Let p(x) = x and q(x) = x? — x + 1. 
. Find (p,q). 
. Find the distance between p and q. 
. Are p and q orthogonal? Explain. 


. Find proj, P. 


a 
b 
c 
d. Find the cosine of the angle between p and q. 
e 
f. Let W = span{ p}. Find W+. 


5. Let V be the inner product space C©[—n, x] 
with inner product defined by 


(f8) = f (x)g(x) dx 


Let W = span{1, cos x, sin x}. 
a. Verify that the set {1, cos x, sin x} is 
orthogonal. 
b. Find an orthonormal basis for W. 
c. Find proj yw x 
d. Find || projy x° ||. 
6. Let B = {vj,.. 
an inner product space V, and let v be a vector 
in V. 


Cl 
a. Find the coordinate of v relative to B. 
Cn 
b. Show that civ; = projy,v for each 
i=1,2,...,n. 
c. 
1 1 1 
V2 > y2 > /6 
0 0 —2 


be an orthonormal basis for R?, with the 
standard inner product, and let 


1 1 
AEA 
1 ! . Find the coordinate 


a) 
pay 
c2 


| ; | of v relative to B. 
Cn 


7. Show that if B is an orthonormal basis for R”, 
with the standard inner product, and 
c1 
[v] = , then 
Cn 
Ivl = yita He 


Give a similar formula for ||v || if B is an 
orthogonal basis, not necessarily orthonormal. 


., Vn} be an orthonormal basis for 


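8. Let $\{\mathbf{v}_1, \ldots, \mathbf{v}_m\}$ be an orthonormal subset of $\mathbb{R}^n$,
with the standard inner product, and let $\mathbf{v}$ be any
vector in $\mathbb{R}^n$. Show that
\[ \|\mathbf{v}\|^2 \geq \sum_{i=1}^{m} (\mathbf{v} \cdot \mathbf{v}_i)^2 \]
(Hint: Expand $\left\| \mathbf{v} - \sum_{i=1}^{m} \langle \mathbf{v}, \mathbf{v}_i \rangle \mathbf{v}_i \right\|^2$.)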


. (QR factorization) Let A be an m x n matrix 


with linearly independent column vectors. In this 
exercise we will describe a process to write 

A = QR, where Q is an m x n matrix whose 
column vectors form an orthonormal basis for 
col(A) and R is ann x n upper triangular matrix 
that is invertible. Let 


a. Let B = {v 1, V2, v3} be the set of column 
vectors of the matrix A. Verify that B is 
linearly independent and hence forms a basis 
for col(A). 

b. Use the Gram-Schmidt process on B to find an 
orthogonal basis By = {w1, W2, W3}. 

c. Use B; to find an orthonormal basis 
B2 = {q1, q2, q3}. 

d. Define the matrix Q = [q1 q2 q3]. Define the 
upper triangle matrix R for i = 1, 2,3 by 


e. Verify that A = QR. 


10. Let $B = \{\mathbf{v}_1, \mathbf{v}_2, \ldots, \mathbf{v}_n\}$ be an orthogonal basis
for an inner product space V and $c_1, c_2, \ldots, c_n$
arbitrary nonzero scalars. Show that
\[ B_1 = \{c_1\mathbf{v}_1, c_2\mathbf{v}_2, \ldots, c_n\mathbf{v}_n\} \]
is an orthogonal basis for V. How can the scalars
be chosen so that $B_1$ is an orthonormal basis?




Chapter 6: Chapter Test 


In Exercises 1—40, determine whether the statement is 
true or false. 


1. If $\mathbf{u}$ is orthogonal to both $\mathbf{v}_1$ and $\mathbf{v}_2$, then $\mathbf{u}$ is
orthogonal to $\operatorname{span}\{\mathbf{v}_1, \mathbf{v}_2\}$.


2. If W is a subspace of an inner product space V
and $\mathbf{v} \in V$, then $\mathbf{v} - \operatorname{proj}_W \mathbf{v} \in W^{\perp}$.


3. If W is a subspace of an inner product space V,
then $W \cap W^{\perp}$ contains a nonzero vector.


4. Not every orthogonal set in an inner product
space is linearly independent.


In Exercises 5—10, let 
—2 
_ _ 1 
y= V = 2 
1 
be vectors in R4 with inner product the standard dot 
product. 
5. || vz || = 30 


6. The distance between the vectors vı and v2 is 


2V 14. 
as E T i 
7. The vector u = 7 vı is a unit vector. 
8. The vectors vı and v2 are orthogonal. 


9. The cosine of the angle between the vectors 
vı and v? is -4vVv10. 


| | 

; —4/15 

10. proj,, v2 = E 
—12/15 


In Exercises 11—16, let 


1 —1 
v = V = 1 
1 1 

2 

V3 = 4 

—2 


be vectors in R? with inner product the standard dot 
product. 


11. 
12. 


13. 


14. 
15. 
16. 


The set {v1, V2, V3} is orthogonal. 


The set {v1, V2, v3} is a basis for R. 


1 

If W = span{v;, vo} andu= | 1 |, then 
1 

projy u = vı + tvz. 


If W = span{v;, v2, v3}, then Wt = {0}. 
If W = span{v;, v2}, then W+ = {0}. 


If W = span{v;, Vo, v3}, then projyv = v for any 
vector v € R. 


In Exercises 17—23, use the inner product defined on 
P2 defined by 


24. 


25. 


26. 


1 
(Pg) = I P(x)q(x) dx 


4 


[= -xl = 73 


. The polynomials p(x) = x and q(x) = x? — 1 are 


orthogonal. 


. The polynomials p(x) = 1 and q(x) = x? — 1 are 


orthogonal. 


. The set {1, x, x? — 1) is orthogonal. 

. The vector p(x) = 5 is a unit vector. 

. If W = span{1, x}, then dim(W*) = 1. 
. If W = span{1, x7}, then a basis for W+ 


is {1, x}. 


An n x n symmetric matrix has n distinct real 
eigenvalues. 


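If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $\mathbb{R}^2$, then
$\langle \mathbf{u}, \mathbf{v} \rangle = 3u_1v_2 - u_2v_1$ defines an inner product.


For any inner product


$\langle 2\mathbf{u}, 2\mathbf{v} + 2\mathbf{w} \rangle = 2\langle \mathbf{u}, \mathbf{v} \rangle + 2\langle \mathbf{u}, \mathbf{w} \rangle$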


27. 


28. 


29. 


30. 


31. 


32. 


33. 


If W = span{1, x?} is a subspace of P2 with 
inner product 


1 
(p.q) = p(x)q(x)dx 


then a basis for W+ is {x}. 


If {u,,..., ug} is a basis for a subspace W of an 
inner product space V and {vj,..., Vm} is a basis 
for W+, then {u,,..., Ux, V1, ..., Vm} is a basis 
for V. 


If A is ann x n matrix whose column vectors 
form an orthogonal set in R” with the standard 
inner product, then col(A) = R”. 


In R? with the standard inner product, the 
orthogonal complement of y = 2x is y = 5x. 


In R? with the standard inner product, the 
orthogonal complement of —3x + 3z = 0 is 
—3 
span 0 
3 


Every finite dimensional inner product space has 
an orthonormal basis. 


If 
1 0 
W = span 2° 1% 1 
1 —1 


then a basis for W+ is also a basis for the null 
space of 


34. 


35. 


36. 


37. 


38. 


39. 


40. 




If 
1 = 
W = span 0 |, 1 
1 0 
then dim(W+) = 2. 
If 
1 
W = span 1{/,} 0 
1 
then 
-1 
WŁ = span —1 
1 
In R with the standard inner product there exists 


a subspace W such that dim(W) = dim(W*+). 


If A is an $n \times n$ matrix whose column vectors are
orthonormal, then $AA^t\mathbf{v}$ is the orthogonal
projection of $\mathbf{v}$ onto col(A).


If $\mathbf{u}$ and $\mathbf{v}$ are orthogonal, $\mathbf{w}_1$ is a unit vector in
the direction of $\mathbf{u}$, and $\mathbf{w}_2$ is a unit vector in the
opposite direction of $\mathbf{v}$, then $\mathbf{w}_1$ and $\mathbf{w}_2$ are
orthogonal.


If $\mathbf{u}$ and $\mathbf{v}$ are vectors in $\mathbb{R}^n$ and the vector
projection of $\mathbf{u}$ onto $\mathbf{v}$ is equal to the vector
projection of $\mathbf{v}$ onto $\mathbf{u}$, then $\mathbf{u}$ and $\mathbf{v}$ are linearly
independent.


If A is an $m \times n$ matrix, then $AA^t$ and $A^tA$ have
the same rank.


APPENDIX 


Preliminaries 


A.1 Algebra of Sets


The notion of a set is a fundamental concept in mathematics allowing for the grouping 
and analysis of objects with common attributes. For example, we can consider the 
collection of all even numbers, or the collection of all polynomials of degree 3. A set 
is any well-defined collection of objects. By this we mean that a clear process exists 
for deciding whether an object is contained in the set. The colors of the rainbow—red, 
yellow, green, blue, and purple—can be grouped in the set 


C = {red, yellow, green, blue, purple} 


The objects contained in a set are called members, or elements, of the set. To indicate
that x is an element of a set S, we write $x \in S$. Since green is one of the colors of the
rainbow, we have that green $\in C$. The color orange, however, is not one of the colors
of the rainbow and therefore is not an element of C. In this case we write orange $\notin C$.

There are several ways to write a set. If the number of elements is finite and 
small, then all the elements can be listed, as we did with the set C, separated by 
commas and enclosed in braces. Another example is 


S = {—3, —2, 0, 1,4, 7} 


If a pattern exists among its elements, a set can be described by specifying only a 
few of them. For example, 
S = {2,4,6,...,36} 


is the set of all even numbers between 2 and 36, inclusive. The set of all even whole 
numbers can be written 
$T = \{2, 4, 6, \ldots\}$


Special sets of numbers are often given special symbols. Several common ones
are described here. The set of natural numbers, denoted by $\mathbb{N}$, is the set


$\mathbb{N} = \{1, 2, 3, \ldots\}$
The set of integers, denoted by $\mathbb{Z}$, is given by
$\mathbb{Z} = \{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$




We use the symbol $\mathbb{Q}$ to denote the set of rational numbers, which can be described as


$\mathbb{Q} = \left\{ \frac{p}{q} \;\middle|\; p, q \in \mathbb{Z},\ q \neq 0 \right\}$


Finally, the set of real numbers, denoted by $\mathbb{R}$, consists of all rational and irrational
numbers. Examples of irrational numbers are $\sqrt{2}$ and $\pi$.
In many cases, the set we wish to consider is taken from a larger one. For example, 


$S = \{x \in \mathbb{R} \mid -1 \le x < 4\}$


is the set of all real numbers greater than or equal to −1 and less than 4. In general,
the notation
$\{x \in L \mid \text{restriction on } x\}$

translates to “the set of all x in L such that x satisfies the restriction.” In some cases 
L is omitted if a universal set is implied or understood. 

Sets can be compared using the notion of containment. Denote two sets by A and
B. The set A is contained in B if each element of A is also in B. When this happens,
we say that A is a subset of B and write $A \subseteq B$. For example, let


A = {1, 2}   B = {1, 2, 3}   and   C = {2, 3, 4}


Since every element of A is also in B, we have $A \subseteq B$. However, A is not a subset
of C since $1 \in A$ but $1 \notin C$. In this case we write $A \nsubseteq C$. For the sets of natural
numbers, integers, rational numbers, and real numbers, we have


$\mathbb{N} \subseteq \mathbb{Z} \subseteq \mathbb{Q} \subseteq \mathbb{R}$


The set with no elements is called the empty set, or null set, and is denoted by $\emptyset$.
One special property of the empty set $\emptyset$ is that it is a subset of every set.

Two sets A and B are equal if they have the same elements. Alternatively, A and
B are equal if $A \subseteq B$ and $B \subseteq A$. In this case we write A = B.


Operations on Sets 


Elements can be extracted from several sets and placed in one set by using the operations
of intersection and union. The intersection of two sets A and B, denoted by
$A \cap B$, is the set of all elements that are in both A and B, that is,


$A \cap B = \{x \mid x \in A \text{ and } x \in B\}$


The union of two sets A and B, denoted by $A \cup B$, is the set of all elements that are
in A or B, that is,
$A \cup B = \{x \mid x \in A \text{ or } x \in B\}$


As an illustration, let A = {1, 3, 5} and B = {1, 2, 4}. Then
$A \cap B = \{1\}$ and $A \cup B = \{1, 2, 3, 4, 5\}$
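
For readers who like to experiment, the illustration above can be reproduced with Python's built-in set type. This is only a sketch: the variable names are chosen here for illustration and are not part of the text.

```python
# Sets from the illustration in the text.
A = {1, 3, 5}
B = {1, 2, 4}

# "&" computes the intersection: elements in both A and B.
intersection = A & B
# "|" computes the union: elements in A or B (or both).
union = A | B

print(intersection)   # {1}
print(sorted(union))  # [1, 2, 3, 4, 5]
```

Python sets, like mathematical sets, ignore order and repetition, which is why the union is sorted before printing.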


A graphical device, called a Venn diagram, is helpful for visualizing set operations. 
The Venn diagrams for the intersection and union of two sets are shown in Fig. 1. 


Figure 1 Venn diagrams of $A \cap B$ and $A \cup B$


EXAMPLE 1 Define two intervals of real numbers by A = [−3, 2) and B = [−7, 1). Find $A \cap B$
and $A \cup B$.


Solution Since the intervals overlap with −7 < −3 < 1 < 2, the intersection is the interval
$A \cap B = [-3, 1)$


and the union is the interval
$A \cup B = [-7, 2)$


Notice that $x \notin A \cap B$ if and only if $x \notin A$ or $x \notin B$, and $x \notin A \cup B$ if and only
if $x \notin A$ and $x \notin B$.

The complement of the set A relative to the set B, denoted by $B \setminus A$, consists of all
elements of B that are not elements of A. In set notation this complement is given by


$B \setminus A = \{x \in B \mid x \notin A\}$
For example, let A and B be the intervals given by A = [1, 2] and B = [0, 5]. Then
$B \setminus A = [0, 1) \cup (2, 5]$


If A is taken from a known universal set, then the complement of A is denoted
by $A^c$. To illustrate, let A = [1, 2] as before. Then the complement of A relative to
the set of real numbers is


$\mathbb{R} \setminus A = A^c = (-\infty, 1) \cup (2, \infty)$


Another operation on sets is the Cartesian product. Specifically, the Cartesian
product of two sets A and B, denoted by $A \times B$, is the set of all ordered pairs whose
first component comes from A and whose second component comes from B. So


$A \times B = \{(x, y) \mid x \in A \text{ and } y \in B\}$
For example, if A = {1, 2} and B = {10, 20}, then
$A \times B = \{(1, 10), (1, 20), (2, 10), (2, 20)\}$


This last set is a subset of the Euclidean plane, which can be written as the Cartesian
product of $\mathbb{R}$ with itself, so that


$\mathbb{R}^2 = \mathbb{R} \times \mathbb{R} = \{(x, y) \mid x, y \in \mathbb{R}\}$
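
The relative complement and the Cartesian product can also be tried out on small finite sets. The sketch below reuses A = {1, 2} and B = {10, 20} from the text; the set U is a hypothetical finite stand-in for the interval [0, 5], since a computer cannot list every real number in an interval.

```python
from itertools import product

# Finite sets from the Cartesian product example in the text.
A = {1, 2}
B = {10, 20}

# itertools.product yields every ordered pair (x, y) with x in A and y in B.
cartesian = set(product(A, B))

# Hypothetical integer stand-in for the interval [0, 5].
U = set(range(0, 6))
# "-" is set difference: elements of U that are not elements of A.
relative_complement = U - A

print(sorted(cartesian))            # [(1, 10), (1, 20), (2, 10), (2, 20)]
print(sorted(relative_complement))  # [0, 3, 4, 5]
```

The number of pairs is the product of the sizes of the two sets, here 2 × 2 = 4.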


EXAMPLE 2 Let A = [−3, 2) and B = (−2, 1]. Describe the set $A \times B$.


Solution Since $A \times B$ consists of all ordered pairs whose first component comes from A and
second from B, we have
$-3 \le x < 2$ and $-2 < y \le 1$
The points that satisfy these two conditions lie in the rectangular region shown in
Fig. 2.

Figure 2

Example 3 shows that operations on sets can be combined to produce results
similar to the arithmetic properties of real numbers.


EXAMPLE 3 Verify that if A, B, and C are sets, then $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$.


Solution The Venn diagrams in Fig. 3 show that although the two sets are computed in
different ways, the result is the same. The quantities inside the parentheses are
carried out first. Of course, the picture alone does not constitute a proof. To establish
the fact, we must show that the set on the left-hand side of the equation above is
a subset of the set on the right, and vice versa.


Figure 3 Venn diagrams of $A \cap (B \cup C)$ and $(A \cap B) \cup (A \cap C)$


Indeed, if $x \in A \cap (B \cup C)$, then $x \in A$ and $x \in B \cup C$. This is equivalent to
the statement $x \in A$ and ($x \in B$ or $x \in C$), which in turn is also equivalent to


($x \in A$ and $x \in B$) or ($x \in A$ and $x \in C$)
Hence, $x \in (A \cap B) \cup (A \cap C)$, and we have shown that
$A \cap (B \cup C) \subseteq (A \cap B) \cup (A \cap C)$
On the other hand, let $x \in (A \cap B) \cup (A \cap C)$, which can also be written as
$x \in (A \cap B)$ or $x \in (A \cap C)$


This gives
($x \in A$ and $x \in B$) or ($x \in A$ and $x \in C$)


In either case, $x \in A$ and, in addition, $x \in B$ or $x \in C$, so that


$x \in A \cap (B \cup C)$


Therefore,
$(A \cap B) \cup (A \cap C) \subseteq A \cap (B \cup C)$


Since each set is a subset of the other, we have


$A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$


Theorem 1 includes the result given in Example 3 along with other properties of
set operations. The verifications of the remaining properties are left as exercises.


THEOREM 1 Let A, B, and C be sets contained in a universal set U.
1. $A \cap A = A$, $A \cup A = A$
2. $(A^c)^c = A$
3. $A \cap A^c = \emptyset$, $A \cup A^c = U$
4. $A \cap B = B \cap A$, $A \cup B = B \cup A$
5. $(A \cap B) \cap C = A \cap (B \cap C)$, $(A \cup B) \cup C = A \cup (B \cup C)$
6. $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
7. $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$


THEOREM 2 DeMorgan’s Laws Let A, B, and C be sets. Then
1. $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$
2. $A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)$
Proof (1) We need to verify that the set on the left-hand side of the equation is a
subset of the set on the right, and vice versa. We begin by letting $x \in A \setminus (B \cup C)$.
This means that $x \in A$ and $x \notin B \cup C$. This is equivalent to the statement


$x \in A$ and ($x \notin B$ and $x \notin C$)
which is then equivalent to
($x \in A$ and $x \notin B$) and ($x \in A$ and $x \notin C$)
This last pair of statements gives
$x \in (A \setminus B) \cap (A \setminus C)$ so that $A \setminus (B \cup C) \subseteq (A \setminus B) \cap (A \setminus C)$


To show containment in the other direction, we let $x \in (A \setminus B) \cap (A \setminus C)$. Rewriting
this in equivalent forms, we have
$x \in (A \setminus B)$ and $x \in (A \setminus C)$
($x \in A$ and $x \notin B$) and ($x \in A$ and $x \notin C$)
$x \in A$ and $x \notin B$ and $x \notin C$
$x \in A$ and $x \notin (B \cup C)$

Therefore,


$(A \setminus B) \cap (A \setminus C) \subseteq A \setminus (B \cup C)$
(2) The proof is similar to the one given in part 1 and is left as an exercise.
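
A quick computation cannot replace the proofs above, but it can catch a misremembered identity. The following sketch checks the distributive law and both of DeMorgan's laws on three small sets; the sets themselves are hypothetical and chosen only for illustration.

```python
# Hypothetical sample sets, chosen so that they overlap in different ways.
A = {1, 2, 3, 4}
B = {3, 4, 5}
C = {4, 5, 6}

# Distributive law: A ∩ (B ∪ C) = (A ∩ B) ∪ (A ∩ C)
distributive_holds = (A & (B | C)) == ((A & B) | (A & C))

# DeMorgan's laws for relative complements ("-" is set difference).
de_morgan_1 = (A - (B | C)) == ((A - B) & (A - C))
de_morgan_2 = (A - (B & C)) == ((A - B) | (A - C))

print(distributive_holds, de_morgan_1, de_morgan_2)  # True True True
```

Passing on one example is evidence, not proof; the element-chasing arguments in the text are what establish the identities for all sets.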


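Exercise Set A.1


In Exercises 1-6, let the universal set be $\mathbb{Z}$ and let
A = {−4, −2, 0, 1, 2, 3, 5, 7, 9}
B = {−3, −2, −1, 2, 4, 6, 8, 9, 10}
Compute the set.
1. $A \cap B$
2. $A \cup B$
3. $A \times B$
4. $(A \cup B)^c$
5. $A \setminus B$
6. $B \setminus A$

In Exercises 7-14, use the sets

A = (−11, 3]   B = [0, 8]   C = [−9, ∞)

Compute the set.
7.
8. $(A \cup B)^c$
9. $A \setminus B$
10. $C \setminus A$
11. $A \setminus C$
12. $(A \cup B) \cap A^c$
13. $(A \cup B) \setminus C$
14. $B \setminus (A \cap C)$

In Exercises 15-20, use the sets

A = (−2, 3]   B = [1, 4]   C = [0, 2]

to sketch the specified set in the plane.
15. $A \times B$
16. $B \times C$
17. $C \times B$
18. $(A \times B) \setminus [C \times (B \cap C)]$
19. $A \times (B \cap C)$
20. $(A \times B) \cap (A \times C)$

In Exercises 21-26, let
A = {1, 2, 3, 5, 7, 9, 11}
B = {2, 5, 10, 14, 20}
C = {1, 5, 7, 14, 30, 37}
Verify that the statement holds.
21. $(A \cap B) \cap C = A \cap (B \cap C)$
22. $(A \cup B) \cup C = A \cup (B \cup C)$
23. $A \cap (B \cup C) = (A \cap B) \cup (A \cap C)$
24. $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$
25. $A \setminus (B \cup C) = (A \setminus B) \cap (A \setminus C)$
26. $A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)$

In Exercises 27-34, show that the statement holds for
all sets A, B, and C.
27. $(A^c)^c = A$
28. The set $A \cup A^c$ is the universal set.
29. $A \cap B = B \cap A$
30. $A \cup B = B \cup A$
31. $(A \cap B) \cap C = A \cap (B \cap C)$
32. $(A \cup B) \cup C = A \cup (B \cup C)$
33. $A \cup (B \cap C) = (A \cup B) \cap (A \cup C)$
34. $A \setminus (B \cap C) = (A \setminus B) \cup (A \setminus C)$

35. If A and B are sets, show that
$A \setminus B = A \cap B^c$

36. If A and B are sets, show that
$(A \cup B) \cap A^c = B \setminus A$

37. If A and B are sets, show that
$(A \cup B) \setminus (A \cap B) = (A \setminus B) \cup (B \setminus A)$

38. If A and B are sets, show that
$A \cap B = A \setminus (A \setminus B)$

39. If A, B, and C are sets, show that
$A \times (B \cap C) = (A \times B) \cap (A \times C)$

40. The symmetric difference operation $\triangle$ on two sets
A and B is defined by
$A \triangle B = (A \setminus B) \cup (B \setminus A)$
Show that
$A \triangle B = (A \cup B) \setminus (A \cap B)$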


A.2 Functions


The sets we described in Sec. A.1 along with functions are two of the fundamental
objects of modern mathematics. Sets act as nouns defining objects and functions as
verbs describing actions to be performed on the elements of a set. Functions connect
each element of one set to a unique element of another set. The functions that are
studied in calculus are defined on sets of real numbers. Other branches of mathematics
require functions that are defined on other types of sets. The following definition is
general enough for a wide variety of abstract settings.


DEFINITION 1 Function A function f from a set X to a set Y is a rule of correspondence
that associates with each element of X exactly one element of Y.


Before continuing with a description of functions, we note that there are other 
ways of associating the elements of two sets. A relation is a rule of correspondence 
that does not (necessarily) assign a unique element of Y for each element of X. 
A function, then, is a relation that is well defined with a clear procedure that associates 
a unique element of Y with each element of X. A common metaphor for a function 
is a machine that produces a unique output for each input. 

A function f is also called a mapping from X to Y and is written $f: X \to Y$.
If $x \in X$ is associated with $y \in Y$ via the function f, then we call y the image of x
under f and write y = f(x). The set X is called the domain of f and is denoted by
dom(f). The range of f, denoted by range(f), is the set of all images of f. That is,


$\operatorname{range}(f) = \{f(x) \mid x \in \operatorname{dom}(f)\}$
If A is a subset of the domain, then the image of A is defined by
$f(A) = \{f(x) \mid x \in A\}$


Using this notation, we have range(f) = f(X).
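
On a finite set, a function can be modeled as a Python dictionary, because each key is associated with exactly one value. The sketch below uses a hypothetical function on X = {1, 2, 3, 4}; the helper name `image` is an assumption made for this illustration.

```python
# A hypothetical function f: X -> Y stored as a dictionary.
# Each key (element of X) has exactly one value (its image in Y).
X = {1, 2, 3, 4}
Y = {"a", "b", "c", "d"}
f = {1: "a", 2: "b", 3: "c", 4: "c"}


def image(func, subset):
    """Return f(A) = {f(x) | x in A} for a function stored as a dict."""
    return {func[x] for x in subset}


domain = set(f)               # dom(f) is the set of keys
range_f = image(f, domain)    # range(f) = f(X)

print(sorted(image(f, {3, 4})))  # ['c']
print(sorted(range_f))           # ['a', 'b', 'c']
```

Here two different inputs share the image "c", and "d" is never attained, so the range is a proper subset of Y.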

There are many ways of describing functions. The pictures shown in Fig. 1 give
us one way while providing an illustration of the key idea distinguishing relations
from functions.




Figure 2 


Figure 3 


Figure 1 A function (left) and a relation that is not a function (right)


The relation f, shown in Fig. 1, is a function since it is well defined with each
element of the set X corresponding to a unique element of the set Y. Notice that more
than one element in the domain of the function f can be associated with the same
element in the range. In this case $x_3$ and $x_4$ both map to $y_3$. However, the relation
g, also shown in Fig. 1, is not a function since $x_3$ corresponds to both $y_3$ and $y_4$.
Notice in this example that f(X) is not equal to Y, since $y_4$ is not in the range of
f. In general, for the mapping $f: X \to Y$, the set X is always the domain, but
$\operatorname{range}(f) \subseteq Y$.

The graph of a function $f: X \to Y$ is a subset of the Cartesian product $X \times Y$
and is defined by


$\operatorname{graph}(f) = \{(x, y) \mid x \in X \text{ and } y = f(x) \in \operatorname{range}(f)\}$
For a function $f: \mathbb{R} \to \mathbb{R}$ the graph is a subset of $\mathbb{R}^2$, the Cartesian plane.
A familiar function is $f: \mathbb{R} \to \mathbb{R}$ defined by the rule


$f(x) = x^2 - 4x + 3 = (x - 2)^2 - 1$


Since the rule describing the function is defined for all real numbers, we have
dom(f) = $\mathbb{R}$. For the range, since the parabola opens upward with vertex (2, −1), we have
range(f) = [−1, ∞). These sets are also evident from the graph of the function,
as shown in Fig. 2. Also the image of x = 0 is f(0) = 3. Notice that in this example
it is also the case that f(4) = 3, so {0, 4} is the set of all real numbers with image
equal to 3. The set {0, 4} is called the inverse image of the set {3}. This motivates the
next concept.

If $f: X \to Y$ is a function and $B \subseteq Y$, then the inverse image of B, denoted
by $f^{-1}(B)$, is the set of all elements of the domain that are mapped to B. That is,


$f^{-1}(B) = \{x \in X \mid f(x) \in B\}$
The set $f^{-1}(B)$ is also called the set of preimages of the set B. As another illustration
let $f: [0, 2\pi] \to [-1, 1]$ be defined by $f(x) = \sin x$. The graph is shown in Fig. 3.
We see from the graph that


$f^{-1}([0, 1]) = [0, \pi] \cup \{2\pi\}$ and $f^{-1}([-1, 0]) = [\pi, 2\pi] \cup \{0\}$
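
The inverse image of a set can be explored numerically by testing each point of a sample domain. The sketch below applies this to the parabola f(x) = x² − 4x + 3 discussed above, restricted to a hypothetical sample of integers; the helper name `preimage` is an assumption for this illustration.

```python
def f(x):
    """The parabola from the text: f(x) = x^2 - 4x + 3."""
    return x**2 - 4 * x + 3


def preimage(func, domain, target):
    """Return {x in domain | func(x) in target}."""
    return {x for x in domain if func(x) in target}


# Hypothetical integer sample of the domain; the true domain is all of R.
sample_domain = range(-2, 7)

print(sorted(preimage(f, sample_domain, {3})))      # [0, 4]
print(sorted(preimage(f, sample_domain, {-1, 0})))  # [1, 2, 3]
```

Note that `preimage` makes sense for any function, whether or not the function has an inverse function.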


EXAMPLE 1 Let $f: \mathbb{R} \to \mathbb{R}$ be the function defined by $f(x) = x^2 - 4x + 3$. Define the sets


A = [0, 3]   B = [1, 4]   C = [−1, 3]   D = [0, 3]


a. Compare the sets $f(A \cap B)$ and $f(A) \cap f(B)$.
b. Compare the sets $f(A \cup B)$ and $f(A) \cup f(B)$.
c. Compare the sets $f^{-1}(C \cap D)$ and $f^{-1}(C) \cap f^{-1}(D)$.
d. Compare the sets $f^{-1}(C \cup D)$ and $f^{-1}(C) \cup f^{-1}(D)$.


Solution
a. Since $A \cap B = [1, 3]$, we see from Fig. 2 that $f(A \cap B) = [-1, 0]$. Again
using the graph of f, we have $f(A) = f([0, 3]) = [-1, 3]$ and
$f(B) = f([1, 4]) = [-1, 3]$, so that $f(A) \cap f(B) = [-1, 3]$. Hence, we have shown
that


$f(A \cap B) \subseteq f(A) \cap f(B)$ with $f(A \cap B) \neq f(A) \cap f(B)$


b. Since $A \cup B = [0, 4]$, we have $f(A \cup B) = [-1, 3]$. Also $f(A) = [-1, 3] = f(B)$,
so that $f(A) \cup f(B) = [-1, 3]$. Therefore,
$f(A \cup B) = f(A) \cup f(B)$


c. Since $C \cap D = [0, 3]$, we have


$f^{-1}(C \cap D) = \{x \in \mathbb{R} \mid f(x) \in [0, 3]\}$
$= \{x \in \mathbb{R} \mid 0 \le f(x) \le 3\}$
$= \{x \in \mathbb{R} \mid 0 \le (x - 2)^2 - 1 \le 3\}$


We see from Fig. 2 that
$f^{-1}(C \cap D) = [0, 1] \cup [3, 4]$


On the other hand,
$f^{-1}(C) = [0, 4]$
The inverse image of the set D is


$f^{-1}(D) = [0, 1] \cup [3, 4]$


Finally,
$f^{-1}(C) \cap f^{-1}(D) = [0, 1] \cup [3, 4] = f^{-1}(C \cap D)$


d. Since $C \cup D = [-1, 3]$, we have from the results in part (c)


$f^{-1}(C \cup D) = [0, 4] = f^{-1}(C) \cup f^{-1}(D)$


Theorem 3 summarizes several results about images of sets and inverse images
of sets including the observations made in Example 1.


THEOREM 3 Let $f: X \to Y$ be a function, and suppose A and B are subsets of X and C and
D are subsets of Y. Then


1. $f(A \cap B) \subseteq f(A) \cap f(B)$
2. $f(A \cup B) = f(A) \cup f(B)$
3. $f^{-1}(C \cap D) = f^{-1}(C) \cap f^{-1}(D)$
4. $f^{-1}(C \cup D) = f^{-1}(C) \cup f^{-1}(D)$
5. $A \subseteq f^{-1}(f(A))$
6. $f(f^{-1}(C)) \subseteq C$

Proof (1) Let $y \in f(A \cap B)$. Then there is some $x \in A \cap B$ such that y = f(x).
Since $x \in A$ and $x \in B$, this means that $y \in f(A)$ and $y \in f(B)$, and hence
$y \in f(A) \cap f(B)$. Therefore, $f(A \cap B) \subseteq f(A) \cap f(B)$.
(3) To show that the sets are equal, we show that each set is a subset of the other.
Let $x \in f^{-1}(C \cap D)$, so that $f(x) \in C \cap D$, which is equivalent to the statement
$f(x) \in C$ and $f(x) \in D$. Therefore, $x \in f^{-1}(C)$ and $x \in f^{-1}(D)$, and we have
$f^{-1}(C \cap D) \subseteq f^{-1}(C) \cap f^{-1}(D)$.

Now let $x \in f^{-1}(C) \cap f^{-1}(D)$, which is equivalent to the statement
$x \in f^{-1}(C)$ and $x \in f^{-1}(D)$. Then $f(x) \in C$ and $f(x) \in D$, so that $f(x) \in C \cap D$.
Therefore, $x \in f^{-1}(C \cap D)$ and hence $f^{-1}(C) \cap f^{-1}(D) \subseteq f^{-1}(C \cap D)$.

(5) If $x \in A$, then $f(x) \in f(A)$, and hence $x \in f^{-1}(f(A))$. This gives
$A \subseteq f^{-1}(f(A))$.


The proofs of parts 2, 4, and 6 are left as exercises.


Example 1(a) provides a counterexample to show that the result in Theorem 3,
part 1, cannot be replaced with equality.
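
The six parts of Theorem 3 can be spot-checked on a finite example. The sketch below uses the squaring function on a hypothetical domain of integers from −3 to 3; the sets A, B, C, D, and E and the helper names are assumptions chosen for illustration, and a passing check is not a proof.

```python
def image(func, subset):
    """Return f(A) = {f(x) | x in A}."""
    return {func(x) for x in subset}


def preimage(func, domain, target):
    """Return {x in domain | f(x) in target}."""
    return {x for x in domain if func(x) in target}


def square(x):
    return x * x


X = set(range(-3, 4))            # hypothetical finite domain
A, B = {-2, -1, 0}, {0, 1, 2}    # subsets of X
C, D, E = {0, 1, 4}, {4, 9}, {4, 5}

part1 = image(square, A & B) <= (image(square, A) & image(square, B))
part1_strict = image(square, A & B) != (image(square, A) & image(square, B))
part2 = image(square, A | B) == (image(square, A) | image(square, B))
part3 = preimage(square, X, C & D) == (preimage(square, X, C) & preimage(square, X, D))
part4 = preimage(square, X, C | D) == (preimage(square, X, C) | preimage(square, X, D))
part5 = A <= preimage(square, X, image(square, A))
part6 = image(square, preimage(square, X, E)) <= E

print(part1, part1_strict, part2, part3, part4, part5, part6)
```

In this example the containment in part 1 is strict, because the squaring function sends −2 and 2 to the same value.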


Inverse Functions 


An inverse function of a function f, when it exists, is a function that reverses the action
of f. Observe that if g is an inverse function of f and f(a) = b, then g(b) = a. For
example, if f(x) = 3x − 1 and g(x) = (x + 1)/3, then f(2) = 5 and g(5) = 2. One
of the most important function-inverse pairs in mathematics and science is $f(x) = e^x$
and $g(x) = \ln x$.

For a function to have an inverse function, the inverse image of each element
of the range of the function must be a single element of the domain. This is often not the case.
For example, the function $f: \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x^2$ cannot be reversed as a
function since the inverse image of the set {4} is the set {−2, 2}. Notice that the inverse
image of a set in the range of a function is always defined, but the function may not
have an inverse function. A function that has an inverse is called invertible. Later
in this section we show that if a function is invertible, then it has a unique inverse.
This will justify the use of the definite article and the symbol $f^{-1}$ when referring to
the inverse of the function f. Functions that have inverses are characterized by the
property called one-to-one. The function described in Fig. 1 is not one-to-one, since
both $x_3$ and $x_4$ are sent to the same element of Y. This cannot occur for a one-to-one
function.


DEFINITION 2 One-to-One Function Let $f: X \to Y$ be a function. Then f is called
one-to-one, or injective, if for all $x_1$ and $x_2$ with $x_1 \neq x_2$, then $f(x_1) \neq f(x_2)$.


Alternatively, f is one-to-one if whenever $f(x_1) = f(x_2)$, then $x_1 = x_2$. For a
function $f: \mathbb{R} \to \mathbb{R}$, this condition is met if every horizontal line intersects the graph
of f in at most one point. When this happens, f passes the horizontal line test and
is thus invertible. This test is similar to the vertical line test used to determine if
f is a function. The inverse of f is denoted by $f^{-1}$ with $f^{-1}: \operatorname{range}(f) \to X$.
Theorem 4 gives a characterization of functions that are invertible. We omit the
proof.


THEOREM 4 Let X and Y be nonempty sets and $f: X \to Y$ be a function. The function f has
an inverse function if and only if f is one-to-one.


As an illustration, let $f: \mathbb{R} \to \mathbb{R}$ be defined by y = f(x) = 3x + 1. Since the
graph, which is a straight line, satisfies the horizontal line test, the function is one-to-one
and hence has an inverse function. To find the inverse in this case is an easy
matter. We can solve for x in terms of y to obtain

$x = \frac{y - 1}{3}$
The inverse function is then written using the same independent variable, so that
$f^{-1}(x) = \frac{x - 1}{3}$
It is also possible to show that a function has an inverse even when it is difficult to
find the inverse.


EXAMPLE 2 Show that the function $f: \mathbb{R} \to \mathbb{R}$ defined by $f(x) = x^3 + x$ is invertible.


Solution By Theorem 4, to show that f is invertible, we show that f is one-to-one. Suppose
that $x_1 \neq x_2$ with $x_1 < x_2$. We wish to show that $f(x_1) \neq f(x_2)$. Since the cubing
function is strictly increasing for all x, we have

$x_1 < x_2$ and $x_1^3 < x_2^3$
Therefore,


$f(x_1) = x_1^3 + x_1 < x_2^3 + x_2 = f(x_2)$ so that $f(x_1) \neq f(x_2)$


The graph of an invertible function can be used to describe the graph of the
inverse function. To see how, suppose that (a, b) is a point on the graph of y = f(x).
Then b = f(a) and $a = f^{-1}(b)$. Consequently, the point (b, a) is on the graph of
$y = f^{-1}(x)$. Since the point (b, a) is the reflection of (a, b) through the line y = x,
the graphs of f and $f^{-1}$ are also reflections through y = x. The graph of the function
and its inverse in Example 2 are shown in Fig. 4.


Figure 4 Graphs of $f(x) = x^3 + x$ and its inverse


When $f: X \to Y$ is a function such that the set of images is all of Y, that is,
f(X) = Y, we call the function onto.


DEFINITION 3 Onto Function The function $f: X \to Y$ is called onto, or surjective, if
range(f) = Y.


For example, the function of Example 2 is onto since the range of f is all of $\mathbb{R}$.
See Fig. 4. A function is called bijective if it is both one-to-one and onto.

Notice that the function $f: \mathbb{R} \to \mathbb{R}$ with $f(x) = x^2$ is not onto since range(f) =
[0, ∞). Of course, every function is a mapping onto its range. So the function
$f: \mathbb{R} \to [0, \infty)$ defined by $f(x) = x^2$ is onto. This new version of the original
function is not one-to-one, but by restricting the domain to [0, ∞), we can define a
version that is one-to-one and onto. That is, the function $f: [0, \infty) \to [0, \infty)$ defined
by $f(x) = x^2$ is a bijection. The function defined in Example 2 is also bijective. Notice
also that a function $f: X \to Y$ has an inverse defined on all of Y if and only if it is bijective.
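
For functions on finite sets, the one-to-one and onto properties can be tested directly from their definitions. The sketch below mirrors the discussion of the squaring function, with hypothetical integer samples standing in for the real intervals; the helper names are assumptions made for this illustration.

```python
def is_injective(func, domain):
    """One-to-one: no two domain elements share an image."""
    values = [func(x) for x in domain]
    return len(values) == len(set(values))


def is_surjective(func, domain, codomain):
    """Onto: the set of images equals the whole codomain."""
    return {func(x) for x in domain} == set(codomain)


def square(x):
    return x * x


whole = range(-3, 4)      # hypothetical stand-in for R
nonneg = range(0, 4)      # hypothetical stand-in for [0, oo)
codomain = {0, 1, 4, 9}   # the squares attained on these samples

print(is_injective(square, whole))              # False, since (-2)^2 == 2^2
print(is_surjective(square, whole, codomain))   # True
print(is_injective(square, nonneg))             # True after restricting the domain
print(is_surjective(square, nonneg, codomain))  # True, so the restriction is bijective
```

Restricting the domain repairs injectivity, and shrinking the codomain to the range repairs surjectivity, which matches the two adjustments described in the text.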


Composition of Functions 


Functions can be combined in a variety of ways to create new functions. For example,
if $f: X_1 \to Y_1$ and $g: X_2 \to Y_2$ are real-valued functions of a real variable, then
the standard arithmetic operations on functions are defined by


$(f + g)(x) = f(x) + g(x)$
$(f - g)(x) = f(x) - g(x)$
$(fg)(x) = f(x)g(x)$
$(f/g)(x) = f(x)/g(x)$


The domains of these functions are given by dom(f + g) = dom(f − g) =
dom(fg) = $X_1 \cap X_2$ and dom(f/g) = $(X_1 \cap X_2) \setminus \{x \mid g(x) = 0\}$. Another method of
combining functions is through the composition of two functions. In the composition
of two functions f and g, the output of one function is used as the input
to the other. For example, if $f(x) = \sqrt{x}$ and $g(x) = x^2 - x - 2$, then
$f(g(3)) = f(4) = 2$ is the composition of f with g evaluated at the number 3 and is denoted
by $(f \circ g)(3)$.


DEFINITION 4 Composition Let A, B, and C be nonempty sets and $f: B \to C$ and $g: A \to B$
be functions. The composition $f \circ g: A \to C$ is defined by


$(f \circ g)(x) = f(g(x))$
The domain of the composition is $\operatorname{dom}(f \circ g) = \{x \in \operatorname{dom}(g) \mid g(x) \in \operatorname{dom}(f)\}$.


A function and its inverse undo each other relative to composition. For example,
let f(x) = 2x − 1. Since f is one-to-one, it is invertible with $f^{-1}(x) = (x + 1)/2$.
Notice that

$(f^{-1} \circ f)(x) = f^{-1}(f(x)) = \frac{f(x) + 1}{2} = \frac{2x - 1 + 1}{2} = \frac{2x}{2} = x$

and

$(f \circ f^{-1})(x) = f(f^{-1}(x)) = 2\left(\frac{x + 1}{2}\right) - 1 = x + 1 - 1 = x$


THEOREM 5 Suppose that $f: X \to Y$ is a bijection. Then


1. $(f^{-1} \circ f)(x) = x$ for all $x \in X$
2. $(f \circ f^{-1})(x) = x$ for all $x \in Y$
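
The composition example and the round-trip property of an inverse can be checked numerically. The sketch below uses the functions from the text, f(x) = √x with g(x) = x² − x − 2, and h(x) = 2x − 1 with inverse (x + 1)/2; the helper `compose` and the names `h` and `h_inv` are assumptions introduced for this illustration.

```python
import math


def compose(outer, inner):
    """Return the composition outer ∘ inner, i.e. x -> outer(inner(x))."""
    return lambda x: outer(inner(x))


def g(x):
    return x**2 - x - 2


def h(x):
    return 2 * x - 1


def h_inv(x):
    return (x + 1) / 2


f_of_g = compose(math.sqrt, g)
print(f_of_g(3))  # 2.0, since g(3) = 4 and sqrt(4) = 2

# A function and its inverse undo each other in either order.
samples = [-2, 0, 0.5, 3]
round_trip_left = [compose(h_inv, h)(x) for x in samples]
round_trip_right = [compose(h, h_inv)(x) for x in samples]
print(round_trip_left == samples, round_trip_right == samples)
```

The sample values are chosen so that the floating-point arithmetic is exact; for arbitrary inputs a tolerance-based comparison such as `math.isclose` would be safer.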


As mentioned earlier, when an inverse function exists, it is unique. To see this,
let $f: X \to Y$ be an invertible function and $f^{-1}$ an inverse function. Suppose that
$g: Y \to X$ is another inverse function for f. Let $I_X$ be the identity function on X
and $I_Y$ the identity function on Y. That is, $I_X(x) = x$ for all $x \in X$ and $I_Y(y) = y$ for
all $y \in Y$. If y is in Y, then


$g(y) = (g \circ I_Y)(y) = (g \circ (f \circ f^{-1}))(y)$
$= g(f(f^{-1}(y))) = ((g \circ f) \circ f^{-1})(y)$
$= I_X(f^{-1}(y)) = f^{-1}(y)$


Since this holds for all y in Y, then $g = f^{-1}$. Consequently, when it exists, the inverse
function is unique. This justifies the use of the symbol $f^{-1}$ for the inverse of f, when
it exists.


THEOREM 6 Let $f: X \to Y$ be a bijection. Then


1. $f^{-1}: Y \to X$ is also a bijection,
2. $(f^{-1})^{-1} = f$


THEOREM 7 Let A, B, and C be nonempty sets and $f: B \to C$ and $g: A \to B$ be functions.


1. If f and g are injections, then $f \circ g$ is an injection.
2. If f and g are surjections, then $f \circ g$ is a surjection.
3. If f and g are bijections, then $f \circ g$ is a bijection.
4. If $f \circ g$ is an injection, then g is an injection.
5. If $f \circ g$ is a surjection, then f is a surjection.


Proof (1) Suppose that $x_1$ and $x_2$ are in A and $(f \circ g)(x_1) = (f \circ g)(x_2)$. Then by
the definition of composition, we have


$f(g(x_1)) = f(g(x_2))$


Since f is an injection, $g(x_1) = g(x_2)$. But since g is also an injection, we have
$x_1 = x_2$. Therefore, $f \circ g$ is an injection.


(5) Let $c \in C$. Since $f \circ g: A \to C$ is a surjection, there is some $a \in A$ such that
$(f \circ g)(a) = c$. That is, $f(g(a)) = c$. But $g(a) \in B$, so there is an element of B
with image under f equal to c. Since c was chosen arbitrarily, we know that f is
a surjection.


The proofs of parts 2, 3, and 4 are left as exercises.


THEOREM 8 Let A, B, and C be nonempty sets and $f: B \to C$ and $g: A \to B$ be functions. If
f and g are bijections, then the function $f \circ g$ has an inverse function and
$(f \circ g)^{-1} = g^{-1} \circ f^{-1}$.

Proof By Theorem 7, the composition $f \circ g: A \to C$ is a bijection; hence by
Theorem 4, the inverse function $(f \circ g)^{-1}: C \to A$ exists. Moreover, the function
$g^{-1} \circ f^{-1}$ also maps C to A. For each $c \in C$ we will show that
$(f \circ g)^{-1}(c) = (g^{-1} \circ f^{-1})(c)$. Let $c \in C$. Since f is onto, there is $b \in B$ such that f(b) = c, so
that $b = f^{-1}(c)$. Next, since g is onto, there is an $a \in A$ such that g(a) = b, which
is equivalent to $a = g^{-1}(b)$. Taking compositions gives $(f \circ g)(a) = f(g(a)) = c$,
and hence $(f \circ g)^{-1}(c) = a$. We also have $g^{-1}(f^{-1}(c)) = (g^{-1} \circ f^{-1})(c) = a$. Since
this holds for all $c \in C$, the functions $(f \circ g)^{-1}$ and $g^{-1} \circ f^{-1}$ are identical, that is,
$(f \circ g)^{-1} = g^{-1} \circ f^{-1}$
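
The order reversal in Theorem 8 is easy to see on finite bijections stored as dictionaries. In the sketch below the maps `g` and `f` and the helpers `invert` and `compose` are hypothetical, chosen only to illustrate the statement.

```python
# Hypothetical finite bijections g: A -> B and f: B -> C.
g = {1: "a", 2: "b", 3: "c"}
f = {"a": "x", "b": "y", "c": "z"}


def invert(mapping):
    """Swap keys and values; valid as an inverse only when the mapping is a bijection."""
    return {value: key for key, value in mapping.items()}


def compose(outer, inner):
    """Return outer ∘ inner as a dictionary: k -> outer[inner[k]]."""
    return {key: outer[inner[key]] for key in inner}


f_after_g = compose(f, g)                     # f ∘ g: A -> C
left_side = invert(f_after_g)                 # (f ∘ g)^(-1)
right_side = compose(invert(g), invert(f))    # g^(-1) ∘ f^(-1)

print(left_side)                # {'x': 1, 'y': 2, 'z': 3}
print(left_side == right_side)  # True
```

To undo "first g, then f", one must first undo f and then undo g, which is why the inverses appear in the opposite order.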


Exercise Set A.2 


In Exercises 1-10, let

X = {1, 2, 3, 4, 5, 6}
Y = {−2, −1, 3, 5, 9, 11, 14}

and define $f: X \to Y$ by the set of ordered pairs

{(1, −2), (2, 3), (3, 9), (4, −2), (5, 11), (6, −1)}

1. Explain why f is a function.
2. Is f a one-to-one function? Explain.
3. Is f an onto function? Specify range(f).
4. Let A = {1, 2, 4}. Find f(A).
5. Find $f^{-1}(\{-2\})$.
6. Find $f^{-1}(f(\{1\}))$.
7. Does f have an inverse function? Explain.
8. Is it possible to define a function with domain X
that is onto Y? Explain.
9. Define a function $g: X \to Y$ that is one-to-one.
10. Is it possible to define a function $g: Y \to X$ that
is onto? Explain.


In Exercises 11-14, use the function $f: \mathbb{R} \to \mathbb{R}$
defined by

$f(x) = x^2$

11. Let A = (−3, 5) and B = [0, 7). Verify that
$f(A \cup B) = f(A) \cup f(B)$

12. Let C = [1, ∞) and D = [3, 5]. Verify that
$f^{-1}(C \cup D) = f^{-1}(C) \cup f^{-1}(D)$

13. Let A = [−2, 0] and B = [0, 2]. Verify that
$f(A \cap B) \subset f(A) \cap f(B)$
and the sets are not equal.

14. Define a function g by the rule $g(x) = x^2$, but
with the domain restricted to the interval [0, ∞).
If A = [0, 5) and B = [2, 7), verify that
$g(A \cap B) = g(A) \cap g(B)$
What property does g have that f does not?




15. 


16. 


17. 


18. 


19. 


20. 


21. 


Define a function $f: \mathbb{R} \to \mathbb{R}$ by $f(x) = ax + b$,
where a and b are real numbers with $a \neq 0$. Find
the inverse function of f.


Define a function $f: \mathbb{R} \to \mathbb{R}$ by $f(x) = x^5 + 2x$.
Show that the inverse function of f exists.


Given a function f, define for each positive
integer n


$f^{(n)}(x) = (f \circ f \circ \cdots \circ f)(x)$


where the composition is taken n − 1 times. If c
is a fixed real number and $f(x) = -x + c$, find
$f^{(n)}(c)$ for all n.


Define a function f: R > R by 


2x if0 <x 
ISN a ifi<x 


Sketch the graphs of y = f(x) and y = (fe f)(x). 
Define a function f: R —> R by 
f@=er! 


a. Show that f is one-to-one. 
b. Is f onto? Justify your answer. 


c. Define a function g with the same rule and 
domain as f but that is onto. 


d. Find the inverse function for the g defined in 
part (c). 
Define a function f: R > R by 
ay? 
f(x) =e 
Show the function is not one-to-one. 


Define a function f: N > N by 
f(n) =2n 


a. Show that f is one-to-one. 

b. Is f onto? Explain. 

c. If E denotes the set of even positive integers 
and O the odd positive integers, find f—!(E) 
and f -1(0). 




22. Define a function $f\colon \mathbb{Z} \to \mathbb{Z}$ by

$f(n) = \ldots$ if $n$ is even
$f(n) = n - 3$ if $n$ is odd

Let $E$ denote the set of even integers and $O$ the
set of odd integers. Find $f(E)$ and $f(O)$.


23. Define a function $f\colon \mathbb{Z} \times \mathbb{Z} \to \mathbb{Z}$ by
$f((m, n)) = 2m + n$

a. Let $A = \{(p, q) \mid p \text{ and } q \text{ are odd}\}$. Find $f(A)$.

b. Let $B = \{(p, q) \mid q \text{ is odd}\}$. Find $f(B)$.

c. Find $f^{-1}(\{0\})$.

d. Let $E$ denote the set of even integers. Find $f^{-1}(E)$.

e. Let $O$ denote the set of odd integers. Find $f^{-1}(O)$.

f. Show that $f$ is not one-to-one.

g. Show that $f$ is onto.


24. Define a function $f\colon \mathbb{R}^2 \to \mathbb{R}^2$ by
$f((x, y)) = (2x, 2x + 3y)$

a. Show that $f$ is one-to-one.

b. Is $f$ onto? Justify your answer.

c. Let $A$ be the set of all points that lie on the
line $y = x + 1$. Find $f(A)$.


In Exercises 25-27, $f\colon X \to Y$ is a function, $A$ and $B$
are subsets of $X$, and $C$ and $D$ are subsets of $Y$. Prove
the statements.

25. $f(A \cup B) = f(A) \cup f(B)$
26. $f^{-1}(C \cup D) = f^{-1}(C) \cup f^{-1}(D)$
27. $f(f^{-1}(C)) \subseteq C$


In Exercises 28-30, $f\colon B \to C$ and $g\colon A \to B$ are
functions. Prove the statements.

28. If $f$ and $g$ are surjections, then $f \circ g$ is a
surjection.
29. If $f$ and $g$ are bijections, then $f \circ g$ is a bijection.
30. If $f \circ g$ is an injection, then $g$ is an injection.


31. If $f\colon X \to Y$ is a function and $A$ and $B$ are
subsets of $X$, show that

$f(A) \setminus f(B) \subseteq f(A \setminus B)$


32. If $f\colon X \to Y$ is a function and $C$ and $D$ are
subsets of $Y$, show that

$f^{-1}(C \setminus D) = f^{-1}(C) \setminus f^{-1}(D)$
A.3 Techniques of Proof


Mathematics is built on facts. A few of these, called axioms, are accepted as
self-evident and do not require justification. Every other statement of fact requires proof.
A proof is the process of establishing the validity of a statement. Results in
mathematics that require proof are called theorems and are made up of two parts. The
first part, called the hypothesis, is a set of assumptions. The second part, called the
conclusion, is the statement that requires proof. It is customary to use the letter P to
denote the hypotheses (or hypothesis if there is only one) and the letter Q to denote
the conclusion. A theorem is symbolized by


$P \Rightarrow Q$


which we read as “if P, then Q” or “P implies Q” or “P is sufficient for Q.” The
converse of a theorem is symbolized by


$Q \Rightarrow P$



read as “Q implies P” or “P is necessary for Q.” For example, let P be the statement
“Mary lives in Iowa” and Q the statement “Mary lives in the United States.” Then
certainly $P \Rightarrow Q$ is a theorem since every resident of Iowa is a resident of the
United States. But $Q \Rightarrow P$ is not a theorem since, for example, if Mary is a
resident of California, then she is a resident of the United States but not a resident of
Iowa. So the statement $Q \Rightarrow P$ is not always true given that Q is true. In terms
of sets, if A is the set of residents of Iowa and B is the set of residents of the United
States, then the statement P is “Mary is in A” and Q is “Mary is in B.” Then “Mary is in A”
implies “Mary is in B.” It is also clear that if Mary is in $B \setminus A$, then “Mary is in B” does
not imply that “Mary is in A.”

A statement that is equivalent to the theorem $P \Rightarrow Q$ is the contrapositive
statement $\sim Q \Rightarrow \sim P$, that is, not Q implies not P. In the example above, if Mary is
not a resident of the United States, then Mary is not a resident of Iowa. An equivalent
formulation of the statement, in the terminology of sets, is that Mary $\notin B$
implies Mary $\notin A$.

There are other statements in mathematics that require proof. Lemmas are
preliminary results used to prove theorems, propositions are results not as important as
theorems, and corollaries are special cases of a theorem. A statement that is not yet
proven is called a conjecture. One of the most famous conjectures is the celebrated
Riemann hypothesis. A single counterexample is enough to refute a false
conjecture. For example, the statement “All lions have green eyes” is rendered invalid by the
discovery of a single blue-eyed lion.

In this section we briefly introduce three main types of proof. A fourth type, 
called mathematical induction, is discussed in Sec. A.4. 


Direct Argument 


In a direct argument, a sequence of logical steps links the hypotheses P to the 
conclusion Q. Example 1 provides an illustration of this technique. 


EXAMPLE 1  Prove that if p and q are odd integers, then p + q is an even integer.

Solution  To prove this statement with a direct argument, we assume that p and q are odd
integers. Then there are integers m and n such that

$p = 2m + 1$ and $q = 2n + 1$

Adding p and q gives

$p + q = 2m + 1 + 2n + 1$
$\phantom{p + q} = 2(m + n) + 2$
$\phantom{p + q} = 2(m + n + 1)$

Since p + q is a multiple of 2, it is an even integer.




Contrapositive Argument 


The contrapositive statement of the statement $P \Rightarrow Q$ is the statement
$\sim Q \Rightarrow \sim P$. The notation $\sim Q$ denotes the negation of the statement Q. A
statement and the contrapositive statement are equivalent, so that if one holds, then the
other also holds. In a contrapositive argument the hypothesis is $\sim Q$, and we proceed
with a direct argument to show that $\sim P$ holds.


EXAMPLE 2  If $p^2$ is an even integer, then p is an even integer.

Solution  In a direct argument we assume that $p^2$ is even, so that we can write $p^2 = 2k$ for
some integer k. Then

$p = \sqrt{2k}$

which does not allow us to conclude that p is even.

To use a contrapositive argument, we assume that p is not an even integer.
That is, we assume that p is an odd integer. Then there is an integer k such that
$p = 2k + 1$. Squaring both sides of the equation $p = 2k + 1$ gives

$p^2 = (2k + 1)^2$
$\phantom{p^2} = 4k^2 + 4k + 1$
$\phantom{p^2} = 2(2k^2 + 2k) + 1$

and hence $p^2$ is an odd integer. Therefore, the original statement holds.


Contradiction Argument 


In a contradiction argument to show that a statement holds, we assume the contrary
and use this assumption to arrive at some contradiction. For example, to prove that
the set of natural numbers $\mathbb{N}$ is infinite, we would assume the set of natural numbers is
finite and argue that this leads to a contradiction. A contrapositive argument is a form
of contradiction where to prove $P \Rightarrow Q$, we assume that P holds and $\sim Q$ holds
and arrive at the conclusion that $\sim P$ holds. Since both P and $\sim P$ cannot be true, we
have a contradiction. In certain cases the contradiction may be hard to recognize.


EXAMPLE 3  Prove that $\sqrt{2}$ is an irrational number.

Solution  To use a contradiction argument, we assume that $\sqrt{2}$ is not irrational. That is, we
assume that there are integers p and q such that

$\sqrt{2} = \dfrac{p}{q}$

where p and q have no common factors. We will arrive at a contradiction by
showing that if $\sqrt{2} = p/q$, then p and q do have a common factor. Squaring both
sides of the last equation gives

$2 = \dfrac{p^2}{q^2}$ so that $p^2 = 2q^2$

Hence, $p^2$ is even. Since $p^2$ is an even integer, then by Example 2 so is p. Thus,
there is an integer k such that $p = 2k$. Substituting 2k for p in the equation $2q^2 = p^2$
gives

$2q^2 = p^2 = (2k)^2 = 4k^2$ so that $q^2 = 2k^2$

Hence, $q^2$ is even, and by Example 2, q is also an even integer. Since p and q are both
even, they have a common factor of 2, which contradicts the assumption that p and q are
chosen to have no common factors.
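
A finite search cannot replace the proof above, but it can illustrate the claim. The following minimal sketch, in which the function name `has_rational_sqrt2` and the search bound are illustrative assumptions, uses exact integer arithmetic to look for integers p and q with $p^2 = 2q^2$.

```python
from math import isqrt


def has_rational_sqrt2(bound):
    """Return True if p*p == 2*q*q for some integers p and q with 1 <= q <= bound."""
    for q in range(1, bound + 1):
        target = 2 * q * q
        p = isqrt(target)  # integer floor of the square root, computed exactly
        if p * p == target:
            return True
    return False


if __name__ == "__main__":
    # No denominator up to the bound yields a fraction whose square is exactly 2.
    print(has_rational_sqrt2(10_000))
```

The search only rules out denominators up to the chosen bound; the contradiction argument is what covers every denominator.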


Quantifiers 


Often statements in mathematics are quantified using the universal quantifier for
all, denoted by the symbol $\forall$, or by the existential quantifier there exists, denoted
by the symbol $\exists$. If P(x) is a statement that depends on the parameter x, then the
symbols


$\forall x, P(x)$


are read for all x, P(x). To prove that the statement is true, we have to verify that the
statement P(x) holds for every choice of x. To prove that the statement is false, we
need to find only one x such that P(x) is false, that is, we need to find a
counterexample. To prove that a statement of the form


$\exists x, P(x)$


holds requires finding at least one x such that P(x) holds. The statement is false if
the statement


$\forall x, \sim P(x)$


holds. When we negate a statement involving quantifiers, $\sim\exists$ becomes $\forall$ and $\sim\forall$
becomes $\exists$. So the statement


$\sim(\exists x, P(x))$ is equivalent to $\forall x, \sim P(x)$

and the statement


$\sim(\forall x, P(x))$ is equivalent to $\exists x, \sim P(x)$
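
Over a finite domain the two negation rules can be checked mechanically, since Python's built-in `all` and `any` play the roles of the universal and existential quantifiers. The sketch below is only an illustration; the helper name `negation_rules_hold` and the sample predicates are assumptions, not part of the text.

```python
def negation_rules_hold(predicate, domain):
    """Check both quantifier negation rules for `predicate` over a finite `domain`."""
    values = [predicate(x) for x in domain]
    # not (there exists x with P(x))  <=>  for all x, not P(x)
    exists_rule = (not any(values)) == all(not v for v in values)
    # not (for all x, P(x))  <=>  there exists x with not P(x)
    forall_rule = (not all(values)) == any(not v for v in values)
    return exists_rule and forall_rule


if __name__ == "__main__":
    print(negation_rules_hold(lambda x: x % 2 == 0, range(10)))
    print(negation_rules_hold(lambda x: x > 100, range(10)))
```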




Exercise Set A.3


1. Prove that in an isosceles right triangle, the
hypotenuse is $\sqrt{2}$ times the length of one of the
equal sides.

2. Prove that if ABC is an isosceles right triangle
with C the vertex of the right angle and sides
opposite the vertices a, b, and c, respectively, then
the area of the triangle is $c^2/4$.

3. Prove that in an equilateral triangle the area of the
triangle is $\sqrt{3}/4$ times the square of the length of
a side.

4. Prove that if s and t are rational numbers with
$t \neq 0$, then s/t is a rational number.

5. Prove that if a, b, and c are integers such that a
divides b and b divides c, then a divides c.

6. Prove that if m and n are even integers, then
m + n is an even integer.

7. Prove that if n is an odd integer, then $n^2$ is an odd
integer.

8. Prove that if n is in $\mathbb{N}$, then $n^2 + n + 3$ is odd.

9. Prove that if a and b are consecutive integers,
then $(a + b)^2$ is an odd integer.

10. Prove that if m and n are odd integers, then mn is
an odd integer.

11. Show that the statement if m and n are two
consecutive integers, then 4 divides $m^2 + n^2$ is
false.

12. Let $f(x) = (x - 1)^2$ and $g(x) = x + 1$. Prove
that if x is in the set $S = \{x \in \mathbb{R} \mid 0 < x < 3\}$,
then $f(x) < g(x)$.

13. Prove that if n is an integer and $n^2$ is odd, then n
is odd.

14. Prove that if n is an integer and $n^2$ is even, then n
is even.

15. Prove that if p and q are positive real numbers
such that $\sqrt{pq} \neq (p + q)/2$, then $p \neq q$.

16. Prove that if c is an odd integer, then the equation
$n^2 + n - c = 0$ has no integer solution for n.

17. Prove that if x is a nonnegative real number such
that $x < \epsilon$, for every real number $\epsilon > 0$, then $x = 0$.

18. Prove that if x is a rational number and x + y is
an irrational number, then y is an irrational
number.

19. Prove that $\sqrt[3]{2}$ is irrational.

20. Prove that if n is in $\mathbb{N}$, then

$\dfrac{n}{n+1} > \dfrac{n}{n+2}$

21. Suppose that x and y are real numbers with
$x < 2y$. Prove that if $7xy < 3x^2 + 2y^2$, then
$3x < y$.

22. Define a function $f\colon X \to Y$ and sets A and B in
X that is a counterexample to show the statement

If $f(A) \subseteq f(B)$, then $A \subseteq B$

is false.

23. Define a function $f\colon X \to Y$ and sets C and D in
Y that is a counterexample to show the statement

If $f^{-1}(C) \subseteq f^{-1}(D)$, then $C \subseteq D$

is false.


In Exercises 24-30, $f\colon X \to Y$ is a function, A and B
are subsets of X, and C and D are subsets of Y. Prove
the statements.

24. If $A \subseteq B$, then $f(A) \subseteq f(B)$.

25. If $C \subseteq D$, then $f^{-1}(C) \subseteq f^{-1}(D)$.

26. If f is an injection, then for all A and B
$f(A \cap B) = f(A) \cap f(B)$

27. If f is an injection, then for all A and B
$f(A \setminus B) = f(A) \setminus f(B)$

28. If f is an injection, then for all A
$f^{-1}(f(A)) = A$

29. If f is a surjection, then for all C
$f(f^{-1}(C)) = C$


A.4 Mathematical Induction


Throughout mathematics there are statements that depend on natural numbers and 
where the aim is to determine whether the statement is true or false for all natural 
numbers. Some simple examples are the following three statements, the third being a 
well-known puzzle, called the Tower of Hanoi puzzle. 


1. For every natural number n, the sum of the first n natural numbers is given by

$1 + 2 + 3 + \cdots + n = \dfrac{n(n + 1)}{2}$

2. The expression 6n + 1 is a prime number for every natural number n.

3. Given three pegs, labeled 1, 2, and 3, and a stack of n disks of decreasing
diameters on peg 1, the disks can be moved to peg 3 in $2^n - 1$ moves. This is
under the restriction that a disk can be placed on top of another disk only when
it has smaller diameter.


When we are considering a statement involving natural numbers, a useful first step
toward insight is to substitute specific numbers for n and determine whether
the statement is true. If the statement is false, often a counterexample is found quickly,
allowing us to reject the statement. For example, in the second statement above, for
n = 1, 2, and 3 the expression 6n + 1 has values 7, 13, and 19, respectively, all of
which are prime numbers. However, if n = 4, then 6(4) + 1 = 25, which is not a
prime number, and the statement is not true for all natural numbers n.
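
The substitution strategy just described is easy to automate. The sketch below is a minimal illustration; the helper names `is_prime` and `first_counterexample` are assumptions introduced here. It searches for the smallest n at which 6n + 1 fails to be prime.

```python
def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def first_counterexample(limit):
    """Return the smallest n in 1..limit with 6n + 1 composite, or None if there is none."""
    for n in range(1, limit + 1):
        if not is_prime(6 * n + 1):
            return n
    return None


if __name__ == "__main__":
    print(first_counterexample(10))
```

A search that returns `None` would not prove the statement; it would only mean that no counterexample exists below the chosen limit.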

In the case of the first statement, the data in Table 1 provide more convincing
evidence that the formula may indeed hold for all natural numbers. Of course, to
establish the fact for all n requires a proof, which we postpone until Example 1.


Table 1

1 + 2 + 3 + ... + n                 n(n + 1)/2

1 = 1                               (1)(2)/2 = 1
1 + 2 = 3                           (2)(3)/2 = 3
1 + 2 + 3 = 6                       (3)(4)/2 = 6
1 + 2 + 3 + 4 = 10                  (4)(5)/2 = 10
1 + 2 + 3 + 4 + 5 = 15              (5)(6)/2 = 15
1 + 2 + 3 + 4 + 5 + 6 = 21          (6)(7)/2 = 21
1 + 2 + 3 + 4 + 5 + 6 + 7 = 28      (7)(8)/2 = 28


For the Tower of Hanoi puzzle, when n = 1, the number of steps required is 1,
and when n = 2, it is also easy to see a solution requiring 3 steps. A solution for
n = 3 is given by the moves


D3 → P3, D2 → P2, D3 → P2, D1 → P3, D3 → P1,
D2 → P3, D3 → P3


where D1, D2, and D3 represent the three disks of decreasing diameters and P1, P2,
and P3 represent the three pegs. So for n = 3, we have a solution with $7 = 2^3 - 1$
moves. Again, the evidence is leading toward the result being true, but we have not
given a satisfactory proof. Let’s push this example a bit further. How can we use the
result for three disks to argue that this result holds for four disks? The same sequence
of steps we gave for the solution of the three-disk problem can be used to move the
stack from P1 to either P2 or P3. Now, suppose that there are four disks on P1.
Since the bottom disk is the largest, P1 can be used as before to move the top three
disks. So as a first step, move the top three disks to P2, which requires $2^3 - 1 = 7$
moves. Next, move the remaining (largest) disk on P1 to P3, which requires 1 move.
Now, using the same procedure as before, move the three-disk stack on P2 over to
P3, requiring another $2^3 - 1 = 7$ moves. The total number of moves is now


$2(2^3 - 1) + 1 = 2^4 - 2 + 1 = 2^4 - 1 = 15$


This approach contains the essentials of mathematical induction. We start with an
initial case, called the base case, that we can argue holds. The next step, called the
inductive hypothesis, provides a mechanism for advancing from one natural number
to the next. In the Tower of Hanoi example, the base case is the case for n = 1, and
one disk on P1 requires only $1 = 2^1 - 1$ move to transfer the disk to P3 or P2.
The inductive hypothesis is to assume that the result holds when there are n disks on
P1. We are required to argue the result holds for n + 1 disks on P1. We did this for
n = 3.
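
The step from n disks to n + 1 disks translates directly into a recursive procedure. The following is a minimal sketch under stated assumptions: the function name `hanoi_moves` and its parameters are illustrative, and disks are labeled D1 (largest) through Dn (smallest) to match the discussion above.

```python
def hanoi_moves(n, source="P1", target="P3", spare="P2", largest=1):
    """List the moves (disk, from_peg, to_peg) that transfer disks
    D{largest}..Dn from `source` to `target`, where D{largest} is the biggest."""
    if largest > n:
        return []  # nothing left to move
    # Move every smaller disk out of the way, onto the spare peg.
    first = hanoi_moves(n, source, spare, target, largest + 1)
    # Move the biggest remaining disk directly to the target peg.
    middle = [(f"D{largest}", source, target)]
    # Move the smaller disks from the spare peg onto the target peg.
    last = hanoi_moves(n, spare, target, source, largest + 1)
    return first + middle + last


if __name__ == "__main__":
    for disk, src, dst in hanoi_moves(3):
        print(f"{disk}: {src} -> {dst}")
    print(len(hanoi_moves(4)))
```

Each call performs the sub-stack transfer twice plus one extra move, which is the count $2(2^n - 1) + 1 = 2^{n+1} - 1$ used in the inductive step.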

Theorem 9 provides a formal statement of the principle of mathematical induction. 
The proof of this statement, which we omit, is based on the axiomatic foundations of 
the natural numbers. Specifically, the proof uses the well-ordering principle, which 
states that every nonempty subset of N has a smallest element. 


THEOREM 9  The Principle of Mathematical Induction
Let P be a statement that depends on the natural number n. Suppose that


1. P is true for n = 1, and
2. When P is true for a natural number n, then P is true for the successor n + 1.


Then the statement P is true for every natural number n.


The principle of mathematical induction is also referred to as mathematical
induction, or simply induction.

An analogy to describe the process of mathematical induction is an infinite row 
of dominoes that are toppled one domino at a time, starting with the first domino. If 
the dominoes are set up so that whenever a domino falls its successor will fall (the 
inductive hypothesis), then the entire row of dominoes will fall once the first domino 
is toppled (base case). 




The principle of mathematical induction is used to prove a statement holds for 
all natural numbers, or for all natural numbers beyond a fixed natural number. This 
is illustrated in the following examples. 


EXAMPLE 1  Prove that for every natural number n,

$\sum_{k=1}^{n} k = 1 + 2 + 3 + \cdots + n = \dfrac{n(n + 1)}{2}$

Solution  To establish the base case when n = 1, notice that

$1 = \dfrac{(1)(2)}{2}$

The inductive hypothesis is to assume that the statement is true for some fixed
natural number n. That is, we assume

$1 + 2 + 3 + \cdots + n = \dfrac{n(n + 1)}{2}$

Next, add n + 1 to the left-hand side of the last equation to obtain

$1 + 2 + 3 + \cdots + n + (n + 1) = (1 + 2 + 3 + \cdots + n) + (n + 1)$

and we apply the inductive hypothesis to conclude

$1 + 2 + 3 + \cdots + n + (n + 1) = \dfrac{n(n + 1)}{2} + (n + 1)$
$\phantom{1 + 2 + 3 + \cdots + n + (n + 1)} = \dfrac{(n + 1)(n + 2)}{2}$

The last equality agrees with the stated formula for the successor of n, that is, for
n + 1. Therefore, by induction the statement holds for all natural numbers.
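
The two ingredients of this proof can be spot-checked numerically. The sketch below is an illustration only; the function name `sum_formula_holds` is an assumption. It compares a running sum with the closed form and also checks the algebra of the inductive step, n(n + 1)/2 + (n + 1) = (n + 1)(n + 2)/2, for each n up to a limit.

```python
def sum_formula_holds(limit):
    """Check the closed form and the inductive-step identity for n = 1..limit."""
    running = 0
    for n in range(1, limit + 1):
        running += n  # running now equals 1 + 2 + ... + n
        closed_form = n * (n + 1) // 2
        if running != closed_form:
            return False
        # Inductive step: adding n + 1 to the closed form for n gives the closed form for n + 1.
        if closed_form + (n + 1) != (n + 1) * (n + 2) // 2:
            return False
    return True


if __name__ == "__main__":
    print(sum_formula_holds(1000))
```

Such a check covers only finitely many cases; the induction argument is what extends the formula to every natural number.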


EXAMPLE 2  Prove that for every natural number n, the number $3^n - 1$ is divisible by 2.

Solution  In Table 2 we have verified that for n = 1, 2, 3, 4, and 5 the number $3^n - 1$ is
divisible by 2.


Table 2

n      3^n - 1

1      2
2      8
3      26
4      80
5      242


In particular, if n = 1, then $3^n - 1 = 2$, which is divisible by 2. Next, we
assume that the statement $3^n - 1$ is divisible by 2 holds. To complete the proof,
we must verify that the number $3^{n+1} - 1$ is also divisible by 2. Since $3^n - 1$ is
divisible by 2, then there is a natural number q such that

$3^n - 1 = 2q$ which gives $3^n = 2q + 1$

Next, we rewrite the expression $3^{n+1} - 1$ to include $3^n$ in order to use the inductive
hypothesis. This gives

$3^{n+1} - 1 = 3(3^n) - 1$
$\phantom{3^{n+1} - 1} = 3(2q + 1) - 1$
$\phantom{3^{n+1} - 1} = 6q + 2$
$\phantom{3^{n+1} - 1} = 2(3q + 1)$

Therefore, the expression $3^{n+1} - 1$ is also divisible by 2.


Recall that factorial notation is used to express the product of consecutive natural 
numbers. Several examples are 


$1! = 1$
$2! = 1 \cdot 2 = 2$
$3! = 1 \cdot 2 \cdot 3 = 6$
$4! = 1 \cdot 2 \cdot 3 \cdot 4 = 24$
$20! = 2{,}432{,}902{,}008{,}176{,}640{,}000$

For a natural number n, the definition of n factorial is the positive integer

$n! = n(n - 1)(n - 2) \cdots 3 \cdot 2 \cdot 1$

We also define $0! = 1$.


EXAMPLE 3  Verify that for every natural number n,

$n! \geq 2^{n-1}$

Solution  For n = 1 the statement is true, since $n! = 1! = 1$ and $2^{n-1} = 2^0 = 1$. Now assume
that the statement $n! \geq 2^{n-1}$ holds. Next, we consider

$(n + 1)! = (n + 1)\,n!$

which we need to show is greater than or equal to $2^n$. Applying the inductive
hypothesis to n! gives the inequality

$(n + 1)! \geq (n + 1)\,2^{n-1}$

Since for every natural number $n \geq 1$ it is also the case that $n + 1 \geq 2$, we have

$(n + 1)! \geq (n + 1)\,2^{n-1} \geq 2 \cdot 2^{n-1} = 2^n$

Consequently, the statement $n! \geq 2^{n-1}$ is true for every natural number n.


EXAMPLE 4  For any natural number n, find the sum of the odd natural numbers from 1 to
2n − 1.

Solution  The first five cases are given in Table 3.


Table 3

1 = 1
1 + 3 = 4
1 + 3 + 5 = 9
1 + 3 + 5 + 7 = 16
1 + 3 + 5 + 7 + 9 = 25


The data in Table 3 suggest that for each $n \geq 1$,

$1 + 3 + 5 + 7 + \cdots + (2n - 1) = n^2$

Starting with the case for n = 1, we see that the left-hand side is 1 and the
expression on the right is $1^2 = 1$. Hence, the statement holds when n = 1. Next,
we assume that $1 + 3 + 5 + \cdots + (2n - 1) = n^2$. For the next case when the index
is n + 1, we consider the sum

$1 + 3 + 5 + \cdots + (2n - 1) + [2(n + 1) - 1] = 1 + 3 + 5 + \cdots + (2n - 1) + (2n + 1)$

Using the inductive hypothesis, we get

$1 + 3 + 5 + \cdots + (2n - 1) + [2(n + 1) - 1] = n^2 + (2n + 1)$
$\phantom{1 + 3 + 5 + \cdots + (2n - 1) + [2(n + 1) - 1]} = n^2 + 2n + 1$
$\phantom{1 + 3 + 5 + \cdots + (2n - 1) + [2(n + 1) - 1]} = (n + 1)^2$

Therefore, by induction the statement holds for all natural numbers.


EXAMPLE 5  Let $P_1, P_2, \ldots, P_n$ be n points in a coordinate plane with no three points collinear
(in a line). Verify that the number of line segments joining all pairs of points is

$\dfrac{n^2 - n}{2}$

Solution  In Fig. 1 is a picture for the case with five points. The number of line segments
connecting pairs of points is $10 = (5^2 - 5)/2$.


Figure 1


If one additional point is added to the graph in Fig. 1, the result is the graph 
shown in Fig. 2. Moreover, adding the one additional point requires adding five 
additional line segments, one to connect the new point to each of the five original 
points. In general, an additional n line segments are required to move from a graph 
with n points to one with n + 1 points. 


Figure 2 


These observations lead to the following proof by induction.

If there is only one point, then the graph contains no line segments. Also since
$(1^2 - 1)/2 = 0$, the statement holds for n = 1. Next, assume the number of line
segments needed to join n points in a coordinate plane is $(n^2 - n)/2$. If there is one
additional point, that is, n + 1 points, then n additional line segments are required.
Hence, by the inductive hypothesis, the total number of line segments required for
n + 1 points is

$\dfrac{n^2 - n}{2} + n = \dfrac{n^2 - n + 2n}{2}$
$\phantom{\dfrac{n^2 - n}{2} + n} = \dfrac{n^2 + 2n + 1 - 1 - n}{2}$
$\phantom{\dfrac{n^2 - n}{2} + n} = \dfrac{(n + 1)^2 - (n + 1)}{2}$

Therefore, by induction the statement holds for all natural numbers.
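
The count can also be confirmed by listing the segments directly, since every segment corresponds to an unordered pair of distinct points. The sketch below is an illustration; the function name `count_segments` is an assumption, and points are represented only by their indices.

```python
from itertools import combinations


def count_segments(n):
    """Count the unordered pairs among n labeled points, one pair per line segment."""
    points = range(n)
    return sum(1 for _ in combinations(points, 2))


if __name__ == "__main__":
    for n in range(1, 8):
        # Compare the enumerated count with the formula (n^2 - n)/2.
        print(n, count_segments(n), (n * n - n) // 2)
```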


Binomial Coefficients and the Binomial Theorem 


In Fig. 3 are the first eight rows of Pascal’s triangle. Notice that each element can
be obtained from the sum of the two elements to the immediate left and right in the
row above.


                      1
                    1   1
                  1   2   1
                1   3   3   1
              1   4   6   4   1
            1   5  10  10   5   1
          1   6  15  20  15   6   1
        1   7  21  35  35  21   7   1

Figure 3


In Fig. 4 are the expansions for $(a + b)^n$ for n = 0, 1, 2, 3, ..., 7. The coefficients
of the expansions are exactly the numbers in Pascal’s triangle.


$(a+b)^0 = 1$
$(a+b)^1 = a + b$
$(a+b)^2 = a^2 + 2ab + b^2$
$(a+b)^3 = a^3 + 3a^2b + 3ab^2 + b^3$
$(a+b)^4 = a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + b^4$
$(a+b)^5 = a^5 + 5a^4b + 10a^3b^2 + 10a^2b^3 + 5ab^4 + b^5$
$(a+b)^6 = a^6 + 6a^5b + 15a^4b^2 + 20a^3b^3 + 15a^2b^4 + 6ab^5 + b^6$
$(a+b)^7 = a^7 + 7a^6b + 21a^5b^2 + 35a^4b^3 + 35a^3b^4 + 21a^2b^5 + 7ab^6 + b^7$

Figure 4


The numbers in Pascal’s triangle or the coefficients of an expansion of the form
$(a + b)^n$ are called the binomial coefficients. Notice that the number 20, in Fig. 3, is
located in row 6 (starting with a row 0) and column 3 (starting with a column 0). In
addition, using factorials, we have the formula


$\dfrac{6!}{3!\,(6 - 3)!} = \dfrac{1 \cdot 2 \cdot 3 \cdot 4 \cdot 5 \cdot 6}{(1 \cdot 2 \cdot 3)(1 \cdot 2 \cdot 3)} = \dfrac{4 \cdot 5 \cdot 6}{1 \cdot 2 \cdot 3} = 20$


DEFINITION 1  Binomial Coefficient  For $n \geq 0$ and $0 \leq r \leq n$, the binomial coefficient
$\binom{n}{r}$ is defined by

$\binom{n}{r} = \dfrac{n!}{r!\,(n - r)!}$


We observed above that entries in Pascal’s triangle can be obtained from the sum 
of the two elements to the immediate left and right in the row above. The next identity 
is the equivalent statement about binomial coefficients. 


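PROPOSITION 1  If k and r are natural numbers such that $0 < r < k$, then

$\binom{k}{r} = \binom{k-1}{r-1} + \binom{k-1}{r}$

Proof  First observe that

$r! = r\,(r - 1)!$ and $(k - r)! = (k - r)(k - r - 1)!$

Expanding the binomial coefficients using factorials gives

$\binom{k-1}{r-1} + \binom{k-1}{r} = \dfrac{(k-1)!}{(r-1)!\,(k-r)!} + \dfrac{(k-1)!}{r!\,(k-1-r)!}$

$= \dfrac{(k-1)!}{(r-1)!\,(k-r-1)!} \left( \dfrac{1}{k-r} + \dfrac{1}{r} \right)$

$= \dfrac{(k-1)!}{(r-1)!\,(k-r-1)!} \left[ \dfrac{r + (k-r)}{r(k-r)} \right]$

$= \dfrac{(k-1)!}{(r-1)!\,(k-r-1)!} \cdot \dfrac{k}{r(k-r)}$

$= \dfrac{k!}{r!\,(k-r)!}$

$= \binom{k}{r}$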
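
Proposition 1 is exactly the rule used to build each row of Pascal's triangle from the row above. As a minimal sketch (the function name `pascal_rows` is an assumption), the code below builds rows by that addition rule alone and compares every entry with Python's standard `math.comb`, which evaluates the binomial coefficient.

```python
from math import comb


def pascal_rows(count):
    """Return rows 0..count-1 of Pascal's triangle, built only by adding neighbors."""
    rows = [[1]]
    while len(rows) < count:
        prev = rows[-1]
        # Each inner entry is the sum of the two entries above it (Proposition 1).
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows[:count]


if __name__ == "__main__":
    for n, row in enumerate(pascal_rows(8)):
        # Every entry should agree with n! / (r! (n - r)!).
        assert row == [comb(n, r) for r in range(n + 1)]
        print(row)
```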


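THEOREM 10  Binomial Theorem  If a and b are any numbers and n is a nonnegative integer,
then

$(a + b)^n = \binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \binom{n}{2} a^{n-2} b^2 + \cdots + \binom{n}{r} a^{n-r} b^r + \cdots + \binom{n}{n-1} a b^{n-1} + \binom{n}{n} b^n$


Proof  The proof is by induction on the exponent n. If n = 1, then $(a + b)^n = a + b$, and

$\binom{n}{0} a^1 + \binom{n}{1} b^1 = \binom{1}{0} a + \binom{1}{1} b = a + b$

Therefore, the statement holds for the case n = 1. Next assume that the statement

$(a + b)^n = \binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \cdots + \binom{n}{n-1} a b^{n-1} + \binom{n}{n} b^n$

holds. For the next case, we consider $(a + b)^{n+1} = (a + b)(a + b)^n$ and apply the
inductive hypothesis. This gives

$(a + b)^{n+1} = (a + b)(a + b)^n$

$= (a + b) \left[ \binom{n}{0} a^n + \binom{n}{1} a^{n-1} b + \cdots + \binom{n}{n-1} a b^{n-1} + \binom{n}{n} b^n \right]$

$= \left[ \binom{n}{0} a^{n+1} + \binom{n}{1} a^n b + \cdots + \binom{n}{n-1} a^2 b^{n-1} + \binom{n}{n} a b^n \right]$
$\quad + \left[ \binom{n}{0} a^n b + \binom{n}{1} a^{n-1} b^2 + \cdots + \binom{n}{n-1} a b^n + \binom{n}{n} b^{n+1} \right]$

Now, combine the terms with the same exponents on a and b to obtain

$(a + b)^{n+1} = \binom{n}{0} a^{n+1} + \left[ \binom{n}{0} + \binom{n}{1} \right] a^n b + \left[ \binom{n}{1} + \binom{n}{2} \right] a^{n-1} b^2 + \cdots + \left[ \binom{n}{n-1} + \binom{n}{n} \right] a b^n + \binom{n}{n} b^{n+1}$

Finally by repeated use of Proposition 1, together with
$\binom{n}{0} = \binom{n+1}{0} = 1$ and $\binom{n}{n} = \binom{n+1}{n+1} = 1$, we have

$(a + b)^{n+1} = \binom{n+1}{0} a^{n+1} + \binom{n+1}{1} a^n b + \binom{n+1}{2} a^{n-1} b^2 + \cdots + \binom{n+1}{n} a b^n + \binom{n+1}{n+1} b^{n+1}$

Therefore, by induction the statement holds for all natural numbers.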
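
The theorem can be spot-checked with exact integer arithmetic. The sketch below is an illustration only; the function name `binomial_expansion` and the sample values are assumptions. It evaluates the right-hand side term by term using `math.comb` and compares it with $(a + b)^n$.

```python
from math import comb


def binomial_expansion(a, b, n):
    """Evaluate the sum over r = 0..n of C(n, r) * a^(n-r) * b^r."""
    return sum(comb(n, r) * a ** (n - r) * b ** r for r in range(n + 1))


if __name__ == "__main__":
    a, b = 2, 3
    for n in range(8):
        # The expansion and the direct power should print the same value.
        print(n, binomial_expansion(a, b, n), (a + b) ** n)
```

Integer inputs keep the comparison exact; with floating-point values the two sides could differ by rounding error.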


Exercise Set A.4


In Exercises 1-10, use mathematical induction to
show that the summation formula holds for all natural
numbers.

1. $1^2 + 2^2 + 3^2 + \cdots + n^2 = \dfrac{n(n+1)(2n+1)}{6}$

2. $1^3 + 2^3 + 3^3 + \cdots + n^3 = \dfrac{n^2(n+1)^2}{4}$

3. $1 + 4 + 7 + \cdots + (3n - 2) = \dfrac{n(3n-1)}{2}$

4. $3 + 11 + 19 + \cdots + (8n - 5) = 4n^2 - n$

5. $2 + 5 + 8 + \cdots + (3n - 1) = \dfrac{n(3n+1)}{2}$

6. $3 + 7 + 11 + \cdots + (4n - 1) = n(2n + 1)$

7. $3 + 6 + 9 + \cdots + 3n = \dfrac{3n(n+1)}{2}$

8. $1 \cdot 2 + 2 \cdot 3 + 3 \cdot 4 + \cdots + n(n + 1) = \dfrac{n(n+1)(n+2)}{3}$

9. $\sum_{k=1}^{n} 2^k = 2^{n+1} - 2$

10. $\sum_{k=1}^{n} k \cdot k! = (n + 1)! - 1$

11. Find a formula for all natural numbers n for the
sum

$2 + 4 + 6 + 8 + \cdots + 2n$

Verify your answer, using mathematical induction.

12. Find a formula for all natural numbers n for the
sum

$\sum_{k=1}^{n} (4k - 3)$

13. Show that for all natural numbers $n \geq 5$, the
inequality $2^n > n^2$ holds. First show the inequality
holds for n = 5, and then proceed to the second
step when using mathematical induction.

14. Show that for all natural numbers $n \geq 3$, the
inequality $n^2 > 2n + 1$ holds. First show that the
inequality holds for n = 3, and then proceed to the
second step when using mathematical induction.

15. Show that for all natural numbers n the
expression $n^2 + n$ is divisible by 2.

16. Show that for all natural numbers n the
expression $x^n - y^n$ is divisible by $x - y$. Note
that $x^2 - y^2$ is divisible by $x - y$ since
$x^2 - y^2 = (x + y)(x - y)$.

17. Use mathematical induction to show that for a
real number r and all natural numbers n,

$1 + r + r^2 + r^3 + \cdots + r^{n-1} = \dfrac{r^n - 1}{r - 1}$

18. Let $f_n$ denote the nth Fibonacci number.

a. Determine the sum of the first n Fibonacci
numbers for n = 2, 3, 4, and 5. That is,
determine $f_1 + f_2$, $f_1 + f_2 + f_3$,
$f_1 + f_2 + f_3 + f_4$, and $f_1 + f_2 + f_3 + f_4 + f_5$.

b. Find a formula for the sum of the first n
Fibonacci numbers.

c. Show that the formula found in part (b) holds
for all natural numbers.

19. Let $A, B_1, B_2, \ldots$ be sets. Prove that for every
natural number n,

$A \cap (B_1 \cup B_2 \cup \cdots \cup B_n) = (A \cap B_1) \cup \cdots \cup (A \cap B_n)$

20. Show that for every natural number n, a $2^n \times 2^n$
grid of squares with one square removed can be
covered with copies of the shape shown in the figure.

21. Verify that if $0 \leq r \leq n$, then

$\binom{n}{r} = \binom{n}{n-r}$

22. Verify that

$\binom{n}{r-1} + \binom{n}{r} = \binom{n+1}{r}$

23. Show that

…

24. Show that

…


Answers to Odd-Numbered Exercises


Chapter 1 x + y = 2 
“| 3x + 3y = -6 
Section 1.1 y 
1. x1 =3,x2 = 8, x3 = —4 41. a. S = {(3 — 2s — t,2 +s — 2t,s,t) | s,t € R} 
3. xı =2— 3x4,xX2 = l1 — x4, X3 = —1 — 2x4, x4 € R b. S = {(7 — 2s — 5t,s,—2 + s + 2t, t) | s,t € R} 
5. x =0,y = —4 43.a. k=3 
7.x=1,y =0 b. k = —3 
9. 5 = { (2#,1)| re R} Hai 
11. x=0,y=1,z=0 Section 1.2 
13. S = { (-1 — 5t,6t + 4,1)|t € R} 1. | | d 
15. S = { (—t + $,-3,¢)|t € R} 
5 4 2 0 -1/)/4 
17. S = { (3 — $t, —s — $t +3,s,t)|s,t € R} 3114 112 
19. x = —2a + b,y = —3a + 2b 4 1 -1}1 
21. x = 2a + 6b — c,y =a +3b,z = —2a — 7b +c r 2 0 -1/4 
23. Consistent if a = —1 "|1 4 1]|2 
25. Consistent if b = —a 2 4 2 2| -2 
27. Consistent for all a,b, and c such that c — a — b = 0 7,14 2 3 2 2 
29. Inconsistent if a = 2 1 3 3 —3 | —4 


31. Inconsistent for a Æ 6 9. x = —l,y = OPEN 
2 . a P e T o 
33. y = (x = 5) — 2; vertex: (3. —2) 11. x = —3 — 2z,y =2 +z,z ER 
35. y = — (x — 2)? + 3; vertex: (2,3) 13. x = —3 + 2y,z =2,y € R 
37. a. (2,3) 15. Inconsistent 


17. x =3 + 2z — 5w,y =2 +z — 2w,z € R,w e R 
19. x = 1 +3w,y =7 +w,z = —l — 2w,w € R 
21. In reduced row echelon form 


b. ay 


23. Not in reduced row echelon form 
25. In reduced row echelon form 


a 27. Not in reduced row echelon form 
1 0 
». [1°] 

1 0 0 

+ y = 2 31. 0 1 0 

9. a { a af 0 0 1 

TETEE IE 
2x + 2y = 2 


1 
35. | 0 

0 
37. x= 
39. x = l,y = 0, 
41. Inconsistent 
43. x, = -4 — 2x3, X2 = -4 + 5x3, X3 eR 
45. x =1— 5X4,X2 =l1- 54,23 =l1- 54,4 ER 


47. x, =1+ 4X3 + 5X4,X2 =2+ 3x3 + 3x4,X3 E R, 


49. 
.c—a+b#0 
. Infinitely many solutions. 


a 
b 
c 
d.a=1,b=0,c=1;x =-2,y =2,z =1 
a 
b 
c 


51. a.a+2b—c=0 

.a+2b-c #0 

. Infinitely many solutions 
d. a = 0,b = 0,c = 0; x = $y = $z = 1 


Section 1.3 


1 4+8=| 


1 —2 10 
-7 3 9 
24+B=|-3 10 6 
2 2 1l 
T 2 6 2 
peN g a e 
-9 4 
sa 5 al 
5 -6 4 
11. 4B=|3 6 -18 
5 -7 6 
1 3 
13.46 +0)=[ 43 al 
10 —18 
15. 4B ~3¢)= | y l 
7 5 
17. 24¢—B' =] -1 3 


= ee 


19. 


21. 


23. 


25. 


27. 


29. 


31. 


33. 


35. 


37. 


39. 
41. 


43. 




ST å 
Pa 

TES 

e 

(44B)C=] 6 8 

4 12 
0 20 15 
(4'C)B = 0 0 0 
-18 -22 -15 


eS 
4B = 4c =| 5 | 


1 0 1 b -1 b 
A has the form | 4 ane “1 le 0 al 


If $AB = BA$, then $A^2B = AAB = ABA = BAA = BA^2$.

If $x = [1, 0, \ldots, 0]^t$, then $Ax = 0$ implies the first column of
$A$ has all 0 entries. Then let $x = [0, 1, 0, \ldots, 0]^t$, and so on, to
show that each column of $A$ has all 0 entries.

The only matrix is the 2 x 2 zero matrix. 

Since $(AA^t)^t = (A^t)^t A^t = AA^t$, the matrix $AA^t$ is
symmetric. Similarly, $(A^tA)^t = A^t(A^t)^t = A^tA$.

If $A^t = -A$, then the diagonal entries satisfy $a_{ii} = -a_{ii}$
and hence $a_{ii} = 0$ for each i.


Section 1.4 
1| -1 2 
as ene 
naast 2] 
3. The matrix is not invertible. 
3 1 —2 
5. A! = | -4 -1 3 
—5 -l1 3 




7. 


11. 


13. 


15. 


17. 


19. 


21. 
23. 


25. 


27. 




The matrix is not invertible. 
1 1 
3 71 -2 3z 
z 0 1 2 —l1 
SATE o oa d 
0 0 0 =} 
3 0 0 0 
—6 3 0 0 
Ai 
ATS 1 —2 -1 0 
1 1 1 1 
The matrix is not invertible. 
0 0 -1 0 
1 -1l -—2 1 
Sas 
a 1 —2 -1 1 
0 -1 -1 1 
AB +4= E 8 =A(B +1) 
~| 10 —-10 | —~ 
aB+B=| ¢ 3 |= 44D 
6 -3 
; 3 4 
2. = 
a. Since A4 = | ae | and 
-u=-| 7; T} | tene- 24 +s =0. 


b= 2 : Tt |= ger 


c. If $A^2 - 2A + 5I = 0$, then $A^2 - 2A = -5I$, so that
$A\left[\tfrac{1}{5}(2I - A)\right] = \tfrac{2}{5}A - \tfrac{1}{5}A^2 = -\tfrac{1}{5}(A^2 - 2A) = -\tfrac{1}{5}(-5I) = I$.

If $\lambda = -2$, then the matrix is not invertible.


a. If $\lambda \neq 1$, then the matrix is invertible.


ee oes CR re NS 
X-1 k=l X-1 
b. 1 1 1 
x1 x1 X-I 
0 0 1 
The matrices 
1 0 0 0 
al J and zeha 
are not invertible, but Æ + B = | ; 1 | is invertible. 


$(A + B)A^{-1}(A - B) = (AA^{-1} + BA^{-1})(A - B)$
$= (I + BA^{-1})(A - B)$
$= A - B + B - BA^{-1}B$
$= A - BA^{-1}B$

Similarly, $(A - B)A^{-1}(A + B) = A - BA^{-1}B$.


29. 


39. 


4l. a 


1. 


a. If $A$ is invertible and $AB = 0$, then
$A^{-1}(AB) = A^{-1}0$, so that $B = 0$.

b. If $A$ is not invertible, then $Ax = 0$ has infinitely
many solutions. Let $x_1, \ldots, x_n$ be solutions of
$Ax = 0$ and $B$ be the matrix with nth column vector
$x_n$. Then $AB = 0$.


. $(AB)^t = B^tA^t = BA = AB$
. If $AB = BA$, then $B^{-1}AB = A$, so $B^{-1}A = AB^{-1}$. Now

$(AB^{-1})^t = (B^{-1})^t A^t = (B^t)^{-1}A^t = B^{-1}A = AB^{-1}$.


. If $A^t = A^{-1}$ and $B^t = B^{-1}$, then

$(AB)^t = B^tA^t = B^{-1}A^{-1} = (AB)^{-1}$.


. a. $(ABC)(C^{-1}B^{-1}A^{-1}) = (AB)CC^{-1}(B^{-1}A^{-1})$
$= ABB^{-1}A^{-1}$
$= AA^{-1} = I$

b. Case 1, k = 2: $(A_1A_2)^{-1} = A_2^{-1}A_1^{-1}$
Case 2: Suppose that

$(A_1A_2 \cdots A_k)^{-1} = A_k^{-1}A_{k-1}^{-1} \cdots A_1^{-1}$

Then

$(A_1A_2 \cdots A_kA_{k+1})^{-1} = ([A_1A_2 \cdots A_k]A_{k+1})^{-1}$
$= A_{k+1}^{-1}[A_1A_2 \cdots A_k]^{-1}$
$= A_{k+1}^{-1}A_k^{-1}A_{k-1}^{-1} \cdots A_1^{-1}$

If $A$ is invertible, then the augmented matrix $[A \mid I]$ can
be row-reduced to $[I \mid A^{-1}]$. If $A$ is upper triangular, then
only terms on or above the main diagonal can be
affected by the reduction process, and hence the inverse
is upper triangular. Similarly, the inverse for an
invertible lower triangular matrix is also lower
triangular.


| axı + bx3 


axı + bx4 e 1 0 
cxi + dx3 = 


cx. + dx4 0 1 


b. From part (a), we have the two linear systems 


{ ax; + bx3 = 1 ror | { 


cxi + dx3 = 0 
so 

(ad — bc)x3 = d (ad — bc)x4 = —b 
If $ad - bc = 0$, then $b = d = 0$.

c. From part (b), both $b = 0$ and $d = 0$. Notice that if
in addition either $a = 0$ or $c = 0$, then the matrix is
not invertible. Also from part (b), we have that
$ax_1 = 1$, $ax_2 = 0$, $cx_1 = 0$, and $cx_2 = 1$. If a and c
are not zero, then these equations are inconsistent
and the matrix is not invertible.


ll 
o 


axı + bx4 
cx. + dx4 


Il 
a 


and 


Section 1.5 


e S n 


11. 


13. 


15. 


17. 


19. 


21. 


23. 


25. 


2 -3 1 x 
A=|-l1 -l 2 |,x=]| y |, and 
3 -2 -2 Zz 

-1 
b= j| -l 
3 
A 3 BO es 7 
A=|-—3 -3 ı O|,x=| ~ |,and 
en re | mA 
X4 
-1 
b= 4 
3 
2x - Sy = 3 
2x poe 2 
— 2y = 3 
2x - y =- Zz = 1 
3x — y + 22 = -l 
2x, + 5x. — 5x3 + 3x4 = 2 
3x, + x2 — 2x3 — 4x, = 0 
1 
L= 4 
—3 
9 
a —3 
~ | —8 
di 
dazd —16 
~ 10 9 
-11 
L= 4 
12 
0 
i 0 
came 1 
-1 
a =z -3 


1f -7 
b. x= 5 | 8 | 
The general solution is 
s[i) 


with a particular nontrivial solution of x = —4 and 
y=l. 




1 2 
27. 4=| 1 2 
2 


29. From the fact that $Au = Av$, we have $A(u - v) = 0$. If
$A$ is invertible, then $u - v = 0$, that is, $u = v$, which
contradicts the statement that $u \neq v$.


1 
sac) i | 


Section 1.6 


1. The determinant is the product of the terms on the 
diagonal and equals 24. 


3. The determinant is the product of the terms on the 
diagonal and equals —10. 


5. Since the determinant is 2, the matrix is invertible. 
7. Since the determinant is —6, the matrix is invertible. 
9. a-c. det(4) = —5 


ad? E- 
d. det 3 -1 4]/)]=5 
2 0 1 


e. Let $B$ denote the matrix in part (d) and $B'$ denote
the new matrix. Then $\det(B') = -2\det(B) = -10$.
Then $\det(A) = \tfrac{1}{2}\det(B')$.


f. Let B” denote the new matrix. The row operation 
does not change the determinant, so 
det(B”) = det(B’) = —10. 
g. Since $\det(A) \neq 0$, the matrix $A$ does have an
inverse.
11. Determinant: 13; invertible
13. Determinant: -16; invertible
15. Determinant: 0; not invertible
17. Determinant: 30; invertible
19. Determinant: -90; invertible
21. Determinant: 0; not invertible
23. Determinant: -32; invertible
25. Determinant: 0; not invertible
27. det(3A) = 3^3 det(A) = 270

29. det((2A)^-1) = 1/det(2A) = 1/(2^3 det(A)) = 1/80
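The answers to Exercises 27 and 29 rest on the scaling rule det(cA) = c^n det(A) for an n x n matrix. The following is a minimal sketch of that rule in Python, assuming a hypothetical diagonal 3 x 3 matrix with determinant 10 (the matrix from the exercise itself is not shown here); the helper names `det` and `scale` are illustrative only.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * m[0][j] * det(minor)
    return total


def scale(c, m):
    """Multiply every entry of the matrix by the scalar c."""
    return [[c * entry for entry in row] for row in m]


# Hypothetical matrix chosen so that det(A) = 10; not taken from the text.
A = [[2, 0, 0], [0, 5, 0], [0, 0, 1]]

det_3A = det(scale(3, A))                        # 3^3 * 10
det_2A_inverse = Fraction(1, det(scale(2, A)))   # 1 / (2^3 * 10)

print(det_3A)          # 270
print(det_2A_inverse)  # 1/80
```

Cofactor expansion is used only because it is short; for larger matrices, row reduction is the more practical way to compute a determinant.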




31. Since the determinant of the matrix is -5x^2 + 10x =
-5x(x - 2), the determinant is 0 if and only if x = 0 or
x = 2.
= foe + pene 
a. \[ A = \begin{bmatrix} 1 & -1 & -2 \\ -1 & 2 & 3 \\ 2 & -2 & -2 \end{bmatrix} \]
b. det(A) = 2


c. Since the coefficient matrix is invertible, the linear 
system has a unique solution. 


b. det(A) = 0


c. Since the determinant of the coefficient matrix is 0, 
A is not invertible. Therefore, the linear system has 
either no solutions or infinitely many solutions. 


d. No solutions 


a. y? x y l 
4 -2 OF 1 
4 3 21 
9 4 -3 1 
= -29y^2 + 20x - 25y + 106 = 0

b. i. 

5 

ü x? y? x y il 
0 16 0 -4 1 
0 140 4 1 
1 4 1 22 1 
4 92 31 
= 136x^2 - 16y^2 - 328x + 256 = 0


45. 


47. 


49. 


51. 
53. 


b. AY 
St 
=e H= T Hi : 

—54 

a. x? xy y? x y 

1 0 0 -1 0 

0 0 1 01 

1 0 0 10 

4 44 22 

9 3 1 31 


jk p ee ee 


= -12 + 12x^2 - 36xy + 42y^2 - 30y = 0


7 -5 5 7 

6 —3 9 2 6 16 
x= = ry y = = 

5 =5 5 5 —5 5 

2 -3 2 -3 

3 -4 =9 3 

—10 5 25 -7 -10 
E S = > y — 

—9 —4 73 —9 —4 

-7 5 -7 5 

eee 
© B 

4 -3 -1 4 

3 4 25 -8 3 29 
x= = y= = 

-1 -3 28 -1 -3 28 

—8 4 —8 4 

160 10 42 
x =~ m} = jos? = 703 
Expansion of the determinant of A across row one
equals the expansion down column one of A^t, so
det(A) = det(A^t).


Section 1.7 


—4 
3. a. \[ E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -3 & 1 \end{bmatrix} \]
I 2 1 
b. EA=| 3 ı 22 
-8 —2 -10 


5. a. I = E3E2E1A


Sl elas le ae 


b. A = E1^-1 E2^-1 E3^-1


=| 2 ilfo w][o i] 


7. a. I = E5E4E3E2E1A


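\[ E_1 = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -1 & 0 & 1 \end{bmatrix} \]

\[ E_3 = \begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad E_4 = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

\[ E_5 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -5 \\ 0 & 0 & 1 \end{bmatrix} \]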

b. A = E1^-1 E2^-1 E3^-1 E4^-1 E5^-1
9. a. I = E6···E1A

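\[ E_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad E_2 = \begin{bmatrix} 1 & -2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]

\[ E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & -1 & 1 \end{bmatrix} \quad E_4 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix} \]

\[ E_5 = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \quad E_6 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix} \]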




fsb] -2 
TEREE 
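13. \[ A = LU = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -3 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 1 \end{bmatrix} \]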
1 
1 0 0 1 -3 
15. 4=LU=| 1 10 01 4 
-1 —4 1 00 3 


17. e LU factorization: 


[33] [34] 


e Solve Ly = | > Jone-tor=3 
e Solve Ux = y: xı =2,x. =3 


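19. • LU factorization:

\[ L = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 2 & 0 & 1 \end{bmatrix} \quad U = \begin{bmatrix} 1 & 4 & -3 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{bmatrix} \]

• \[ y = Ux = \begin{bmatrix} x_1 + 4x_2 - 3x_3 \\ x_2 + 2x_3 \\ x_3 \end{bmatrix} \]

• Solve Ly = \begin{bmatrix} 0 \\ -3 \\ 1 \end{bmatrix}: y1 = 0, y2 = -3, y3 = 1

• Solve Ux = y: x1 = 23, x2 = -5, x3 = 1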
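The two-stage solve in Exercise 19 (first Ly = b by forward substitution, then Ux = y by back substitution) can be sketched as follows. The matrices L and U and the right-hand side are the ones from the answer above; the function names are illustrative, and the sketch assumes nonzero diagonal entries.

```python
def forward_substitution(L, b):
    """Solve Ly = b for a lower triangular matrix L, top row first."""
    y = []
    for i, row in enumerate(L):
        known = sum(row[j] * y[j] for j in range(i))
        y.append((b[i] - known) / row[i])
    return y


def back_substitution(U, y):
    """Solve Ux = y for an upper triangular matrix U, bottom row first."""
    n = len(U)
    x = [0] * n
    for i in reversed(range(n)):
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - known) / U[i][i]
    return x


L = [[1, 0, 0], [-1, 1, 0], [2, 0, 1]]
U = [[1, 4, -3], [0, 1, 2], [0, 0, 1]]
b = [0, -3, 1]

y = forward_substitution(L, b)
x = back_substitution(U, y)
print(y)  # [0.0, -3.0, 1.0]
print(x)  # [23.0, -5.0, 1.0]
```

Once A = LU is known, each new right-hand side costs only these two triangular solves, which is the main reason for factoring A in the first place.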


21. èe LU factorization: 


oreo 
=O 


OoOrFN WwW 
SeenNne 




xı — 2x2 + 3x3 + x4 Section 1.8 
e y=Ux= x2 + 2x3 + 2x4 1. xy = 2,x2 = 9,x3 = 3,x4 = 9 
x3 + X4 3. Let xs = 3. Then xı = x5 =3,x7 = txs = 1,x3 = 
7 {x5 = 1,x4 =X5 = 3. 
: 5. Let x1,x2,...,x7 be defined as in the figure. 
e Solve Ly = : 
14 ý 300 
—8 atl y 
yı = 5,y2 = l, y3 = 4, y4 = —2 8004 os 4500 
e Solve Ux = y: xı = —25, x2 = —7,x3 = 6,x4 = —2 
x37 Y x6 
23. A = PLU a 
0 1 0 1 0 0 1 — 2 x4 ¢ = 4300 
=|1 00 25 0 0 1 -3 
0 0 0 0 1 = 0 1 700 Ég 
28. A= LU =| 1 d | 1 a Then x; = 1000 — x4 — x7, x2 = 800 — x6, x3 = 
=3 1 0 1 1000 — x4 + x6 — x7,x5 = 300 + x6 — x7 
i i 1 —4 1 0 Since the network consists of one-way streets, the 
AW =U L = | 0 1 | | 3 1 | individual flows are nonnegative. As a sample solution 
let x4 = 200, x6 = 300, x7 = 100; then xı = 700, 
_ | all =4 | x2 = 500, x3 = 1000, xs = 500. 
3 | Ts Xi = 150 — x4, x2 = 50 — x4 — x5,x3 = 50 + x4 + x5. 
1 0 0 > t <1 As a sample solution let x4 = x5 = 20; then 
27.A=LU=}] 1 1 0 0 1 -l xı = 130, x2 = 10, x3 = 90 
1 1 1 0 0 3 9. X] = 1.4, x2 = 3.2, x3 — 1.6, x4 = 6.2 
Asy p 0.02 0.04 0.05 
i i 11. a. A= | 0.03 0.02 0.04 
| 3 3 0 1 00 0.03 0.3 0.1 
=j 0 1 i -1 1 0 b. The internal demand vector is 
| 0o o 2 meh 300 22 
1 -l 9 A} 150 | = | 20 |. The total external demand 
s i 200 74 
=| 4 2 1 
i A for the three sectors is 300 — 22 = 278, 150 — 20 = 
0 -3 3 130, and 200 — 74 = 126, respectively. 
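29. Suppose

\[ \begin{bmatrix} a & 0 \\ b & c \end{bmatrix} \begin{bmatrix} d & e \\ 0 & f \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \]

This gives the system of equations ad = 0, ae = 1,
bd = 1, be + cf = 0. The first two equations are
satisfied only when a ≠ 0 and d = 0. But this is
incompatible with the third equation.


31. If A is invertible, there are elementary matrices
E1, ..., Ek such that I = Ek···E1A. Similarly, there are
elementary matrices D1, ..., Dℓ such that
I = Dℓ···D1B. Then A = E1^-1···Ek^-1 Dℓ···D1B, so A
is row equivalent to B.


c. \[ (I - A)^{-1} \approx \begin{bmatrix} 1.02 & 0.06 & 0.06 \\ 0.03 & 1.04 & 0.05 \\ 0.05 & 0.35 & 1.13 \end{bmatrix} \]

d. \[ X = (I - A)^{-1} D = \begin{bmatrix} 1.02 & 0.06 & 0.06 \\ 0.03 & 1.04 & 0.05 \\ 0.05 & 0.35 & 1.13 \end{bmatrix} \begin{bmatrix} 350 \\ 400 \\ 600 \end{bmatrix} = \begin{bmatrix} 418.2 \\ 454.9 \\ 832.3 \end{bmatrix} \]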
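For the input-output model of Section 1.8, Exercise 11, the production vector X = (I - A)^-1 D can also be found by solving (I - A)X = D directly, without forming the inverse. The sketch below assumes the consumption matrix A and demand vector D read from the answer to that exercise; the `solve` helper is a hypothetical, generic Gaussian elimination routine, not something defined in the text.

```python
def solve(matrix, rhs):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(v) for v in row] + [float(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Swap in the row with the largest pivot candidate.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution on the upper triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        known = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - known) / aug[i][i]
    return x


# Consumption matrix and external demand as read from Exercise 11.
A = [[0.02, 0.04, 0.05],
     [0.03, 0.02, 0.04],
     [0.03, 0.30, 0.10]]
D = [350, 400, 600]

I_minus_A = [[(1 if i == j else 0) - A[i][j] for j in range(3)] for i in range(3)]
X = solve(I_minus_A, D)

# Internal demand for the production levels used in part (b).
production = [300, 150, 200]
internal = [sum(a * p for a, p in zip(row, production)) for row in A]

print([round(v, 1) for v in X])
print([round(v, 6) for v in internal])
```

The rounded production levels should agree with part (d) to about one decimal place; small differences are expected because the inverse printed in part (c) is itself rounded.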


13. 


15. 


17. 


a. [Figure omitted.]
b. 3,880,900a + 1970b + c = 80
   3,920,400a + 1980b + c = 250
   3,960,100a + 1990b + c = 690
c. a = 27/20, b = -10,631/2, c = 5,232,400
d. [Figure omitted; vertical axis ticks from 0 to 1400.]


e. The model gives an estimate, in billions of dollars,
for health care costs in 2010 at

(27/20)(2010)^2 - (10,631/2)(2010) + 5,232,400 = 2380


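a. \[ A = \begin{bmatrix} 0.9 & 0.08 \\ 0.1 & 0.92 \end{bmatrix} \]
b. \[ A \begin{bmatrix} 1,500,000 \\ 600,000 \end{bmatrix} = \begin{bmatrix} 1,398,000 \\ 702,000 \end{bmatrix} \]
c. \[ A^2 \begin{bmatrix} 1,500,000 \\ 600,000 \end{bmatrix} = \begin{bmatrix} 1,314,360 \\ 785,640 \end{bmatrix} \]
d. \[ A^n \begin{bmatrix} 1,500,000 \\ 600,000 \end{bmatrix} \]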
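The transition matrix is

\[ A = \begin{bmatrix} 0.9 & 0.2 & 0.1 \\ 0.1 & 0.5 & 0.3 \\ 0 & 0.3 & 0.6 \end{bmatrix} \]

so the numbers of people in each category after
1 month are given by

\[ A \begin{bmatrix} 20,000 \\ 20,000 \\ 10,000 \end{bmatrix} = \begin{bmatrix} 23,000 \\ 15,000 \\ 12,000 \end{bmatrix} \]

after 2 months by

\[ A^2 \begin{bmatrix} 20,000 \\ 20,000 \\ 10,000 \end{bmatrix} = \begin{bmatrix} 24,900 \\ 13,400 \\ 11,700 \end{bmatrix} \]

and after 1 year by

\[ A^{12} \begin{bmatrix} 20,000 \\ 20,000 \\ 10,000 \end{bmatrix} \approx \begin{bmatrix} 30,530 \\ 11,120 \\ 8,350 \end{bmatrix} \]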
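The month-by-month populations above come from repeatedly multiplying the current state vector by the transition matrix. A minimal sketch of that iteration, using the matrix and initial vector from this answer (the function name `step` is illustrative):

```python
def step(A, v):
    """One transition: return the matrix-vector product A v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


A = [[0.9, 0.2, 0.1],
     [0.1, 0.5, 0.3],
     [0.0, 0.3, 0.6]]
state = [20000, 20000, 10000]

history = [state]
for month in range(2):
    state = step(A, state)
    history.append(state)

after_1 = [round(x) for x in history[1]]
after_2 = [round(x) for x in history[2]]
print(after_1)  # [23000, 15000, 12000]
print(after_2)  # [24900, 13400, 11700]
```

Raising the loop count to 12 gives the one-year estimate. Because each column of A sums to 1, the total population stays at 50,000 at every step.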

19. a. I1 + I3 = I2

b. 4I1 + 3I2 = 8
   3I2 + 5I3 = 10

c. I1 - I2 + I3 = 0
   4I1 + 3I2 = 8
   3I2 + 5I3 = 10


Solution: I1 ≈ 0.72, I2 ≈ 1.7, I3 ≈ 0.98


21. Denote the average temperatures of the four points by 
a,b,c, and d clockwise, starting with the upper left 
point. The resulting linear system is 


 4a -  b      -  d = 50
 -a + 4b -  c      = 55
      -b + 4c -  d = 45
 -a      -  c + 4d = 40


The solution is a ≈ 24.4, b ≈ 25.6, c ≈ 23.1, d ≈ 21.9.
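The rounded temperatures in Exercise 21 can be checked by solving the 4 x 4 system exactly. The sketch below uses Gauss-Jordan elimination over rational numbers; the coefficient matrix is the system as stated above, and `solve_exact` is a hypothetical helper that assumes the matrix is invertible.

```python
from fractions import Fraction


def solve_exact(matrix, rhs):
    """Gauss-Jordan elimination with exact rational arithmetic."""
    n = len(matrix)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot entry is 1.
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        # Clear the column in every other row.
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


coefficients = [[4, -1, 0, -1],
                [-1, 4, -1, 0],
                [0, -1, 4, -1],
                [-1, 0, -1, 4]]
constants = [50, 55, 45, 40]

a, b, c, d = solve_exact(coefficients, constants)
print(float(a), float(b), float(c), float(d))  # 24.375 25.625 23.125 21.875
```

Rounding the exact values to one decimal place gives the approximations quoted in the answer.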


Review Exercises Chapter 1 


2 


l 

e 
Fe NO Fe 
woe N -e 


1 
0 
2 


b. det(A) = -8


c. Since the determinant of the coefficient matrix is not 
0, the matrix is invertible and the linear system is 
consistent and has a unique solution. 


d. The only solution is the trivial solution. 


e. From part (b), since the determinant is not zero, the 
inverse exists. 


—3 -8 -2 7 
1 5 8 6 —9 
8 5 0 =2 =1 
—4 0 0 4 




3 11 
1 —17 

— 4- — 1 
f.x =A gc =a 7 
5 4 


. a = 0, c = 0, b = 0; a = 0, c = 1, b ∈ R;
a = 1, c = 0, b ∈ R; a = 1, b = 0, c = 1


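. a. If

\[ A = \begin{bmatrix} a_1 & b_1 \\ c_1 & d_1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} a_2 & b_2 \\ c_2 & d_2 \end{bmatrix} \]

then the sum of the diagonal entries of AB - BA is

(a1a2 + b1c2) - (a1a2 + b2c1)
+ (b2c1 + d1d2) - (b1c2 + d1d2) = 0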


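b. \[ \begin{bmatrix} a & b \\ c & -a \end{bmatrix} \begin{bmatrix} a & b \\ c & -a \end{bmatrix} = \begin{bmatrix} a^2 + bc & 0 \\ 0 & a^2 + bc \end{bmatrix} = (a^2 + bc)I \]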


c. Let M = AB - BA. By parts (a) and (b), M^2 = kI for some
k. Then

(AB - BA)^2 C = M^2 C
= (kI)C = C(kI)
= CM^2 = C(AB - BA)^2

a. Since det(A) = 1, then A is invertible.

b. Six 1s can be added, making 21 the maximum
number of entries that can be 1 with the matrix
still invertible.

a. B^t = (A + A^t)^t = A^t + (A^t)^t = A^t + A = B;

C^t = (A - A^t)^t = A^t - (A^t)^t = A^t - A = -C


b. A = (1/2)(A + A^t) + (1/2)(A - A^t)


Chapter Test: Chapter 1 


SN E 


10. 
12. 
14. 
16. 
18. 
20. 
22. 


7d ie. o e a DT 
AeA RH HHA HBAS 


23. T 24. T 
25. T 26. T 
27. F 28. T 
29. F 30. T 
31. T 32. T 
33. F 34. T 
35. F 36. T 
37. T 38. F 
39. T 40. T 
41. T 42. F 
43. T 44. F 
45. T 
Chapter 2 
Section 2.1 
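1. \[ u + v = \begin{bmatrix} -1 \\ 2 \\ 3 \end{bmatrix} = v + u \]

3. \[ u - 2v + 3w = \begin{bmatrix} 11 \\ -7 \\ 0 \end{bmatrix} \]

5. \[ -3(u + v) - w = \begin{bmatrix} 1 \\ -7 \\ -8 \end{bmatrix} \]

7. \[ \begin{bmatrix} -17 \\ -14 \\ 9 \\ -6 \end{bmatrix} \]

9. \[ (x_1 + x_2)u = (x_1 + x_2) \begin{bmatrix} 1 \\ -2 \\ 3 \\ 0 \end{bmatrix} = \begin{bmatrix} x_1 + x_2 \\ -2x_1 - 2x_2 \\ 3x_1 + 3x_2 \\ 0 \end{bmatrix} = \begin{bmatrix} x_1 \\ -2x_1 \\ 3x_1 \\ 0 \end{bmatrix} + \begin{bmatrix} x_2 \\ -2x_2 \\ 3x_2 \\ 0 \end{bmatrix} = x_1 u + x_2 u \]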


11. 
13. 


15. 


17. 


19. 


21. 


23. 


25. 


27. 


29. 


v = 2e1 + 4e2 + e3
v = 3e — 263 


— 
| 
N 
ao 9 
1 + 
Nw 
9 9 
N N 
Il 


Solution: cj = i. Q=- 


The vector | 
3 
R 


Cl = C2 = 3 
2c = 2c2 = 1 


Solution: The linear system is inconsistent. 


_ | is a combination of | = | and 


3 : eee 
The vector | cannot be written as a combination 


1 
1 —1 
of and 7 
—4c, = 5c3 = —3 
4cı + 3c2 + C3 = -3 
3c) -— C2 — 563 = 4 
Fe ist oe 238 4. _ 3 
Solution: ¢) = 77.0 = -Bie = TH 
The vector | —3 | is a combination of the three 
4 
vectors. 
= = cp + ce} = = 
C? N i: = 0 
C1 + C = «G3 = 2 
Solution: The linear system is inconsistent 
—1 
The vector 0 | cannot be written as a 
2 


combination of the other vectors. 
All 2 x 2 vectors. Moreover, c} = 
C2 = za + xb 


1 2 
34 = 5b, 


All vectors of the form | =e | such that a € R. 


All 3 x 3 vectors. Moreover, cı = za — <b + dc, 
1 2 1 1 1 1 
c2 = —34 + §b + 36,63 = 34 +53b- 3° 




31. All vectors of the form \begin{bmatrix} a \\ b \\ 2a - 3b \end{bmatrix} such that
a, b ∈ R.
Section 2.2 
1 1 —2 | —4 eee 1 0/2). 
la 3| Sia o | So? 
a3 3i 1-3/0]. 
Jee IARR 5 p pre 
5. Yes 
—2 1] -3 1 
3 4| 10 | — | 0 1/1 
4 2] 10 0 
7. Yes 
2 3 —2)|2 1 0 O| -4 
-2 0 0f8|—]|o 1 $ 
0 -3 -1)]2 0 0 14-4 
9. No 
1 —1 O} - 1 0 0] 0 
2 -1 1 — | 0 1 1 
—1 3 2 5 0 0 1 
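11. Yes

\[ \left[ \begin{array}{rrr|r} 2 & 1 & -1 & 3 \\ -3 & 6 & -1 & -17 \\ 4 & -1 & 2 & 17 \\ 1 & 2 & 3 & 7 \end{array} \right] \longrightarrow \left[ \begin{array}{rrr|r} 1 & 0 & 0 & 3 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 0 \end{array} \right] \]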
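The reduced matrix in Exercise 11 says the target vector equals 3 times the first column, minus the second, plus 2 times the third. A quick check of that claim, assuming the columns of the augmented matrix above are the vectors of the exercise (the function name is illustrative):

```python
def linear_combination(coefficients, vectors):
    """Return c1*v1 + c2*v2 + ... computed entry by entry."""
    length = len(vectors[0])
    return [sum(c * v[i] for c, v in zip(coefficients, vectors)) for i in range(length)]


# Columns of the coefficient part of the augmented matrix.
v1 = [2, -3, 4, 1]
v2 = [1, 6, -1, 2]
v3 = [-1, -1, 2, 3]
target = [3, -17, 17, 7]

# Coefficients read from the last column of the reduced matrix.
result = linear_combination([3, -1, 2], [v1, v2, v3])
print(result)            # [3, -17, 17, 7]
print(result == target)  # True
```

Substituting the coefficients back this way is a cheap guard against arithmetic slips in the row reduction.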


13. Infinitely many ways 

cp=1lt 503502 =1+ 4C3,€3 eR 
15. Infinitely many ways 

c1 = 3 + 6c4, c2 = -2 - c4, c3 = 2 + 2c4, c4 ∈ R
17. Yes 


1 -2 -1|-2 10 0 
2 3 3| 4] _ }0 1 0 
1 1 2| 4 00 1 
-1 4 1] 0 0 0 0 
19. No 
2 3 3| 2 1 0 0 
Sa eo ee OO oe) 
=i 2 2/-1 00 1 
3 -2 2| 2 00 0 


-1 
-1 


3 
0 


=. Ooo 


450 


21. 


23. 


25. 
27. 


29. 


31. 
33. 
35. 


37. 


Answers to Odd-Numbered Exercises 


-1 —2 
(AB) =3| 5 | +2| a 

—1 —2 
(AB), =2| JESI a 
Not possible. 


x^3 - 2x + 1 = (1/2)(1 + x) + 2(-x) + 0(x^2 + 1) +
(1/2)(2x^3 - x + 1)


All vectors \begin{bmatrix} a \\ b \\ c \end{bmatrix} such that 3a - b + c = 0.


v = 2v1 - v2 + 4v3


i , = gy Sp 
Since cı Æ 0, vı = ave zí Vn 
Let v ∈ S1. Since c ≠ 0, then v = c1v1 + ··· + (ck/c)(cvk),
so v ∈ S2. If v ∈ S2, then v = c1v1 + ··· + (c ck)vk, so
v ∈ S1. Therefore, S1 = S2.


If A; = cA], then det(4) = 0. Since the linear system 
is assumed to be consistent, it must have infinitely 
many solutions. 


Section 2.3 
} — 2 ; 
1. Since 1 3/7 1, the vectors are linearly 
independent. 
3. Since | _ 4 = | = 0, the vectors are linearly 
dependent. 


-1 2 
. Since 2° 2 
1 3 


, the vectors are 


linearly independent. 


. Since \begin{vmatrix} -4 & -5 & 3 \\ 4 & 3 & -5 \\ -1 & 3 & 5 \end{vmatrix} = 0, the vectors are linearly
dependent.
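A zero determinant signals dependence but does not show the dependency itself. The sketch below recomputes the 3 x 3 determinant above with the rule of Sarrus and then checks one explicit relation, under the assumption that the three vectors of the exercise are the columns of the matrix; the names `det3`, `v1`, `v2`, `v3` are illustrative.

```python
def det3(m):
    """Determinant of a 3 x 3 matrix by the rule of Sarrus."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * e * i + b * f * g + c * d * h - c * e * g - b * d * i - a * f * h


M = [[-4, -5, 3],
     [4, 3, -5],
     [-1, 3, 5]]
print(det3(M))  # 0

# Assumption: the vectors are the columns of M.
v1 = [row[0] for row in M]
v2 = [row[1] for row in M]
v3 = [row[2] for row in M]

# One explicit dependency: v3 = -2*v1 + v2.
combo = [-2 * p + q for p, q in zip(v1, v2)]
print(combo == v3)  # True
```

If the vectors were instead the rows of the matrix, the determinant would be unchanged but the coefficients of the dependency would differ.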
3 1 3 Bo AL 3 
= = 0 4 0 
. Since aes l — à , the 
-l1 2 0 0 1 
2e +i, 1 0 0 0 


vectors are linearly independent. 


11. 


13. 


15. 


17 


19. 


21. 


23. 


25 


27 
29 
31 
33 


35 


37 


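Since

\[ \begin{bmatrix} 3 & 0 & 1 \\ 3 & 1 & -1 \\ 2 & 0 & -1 \\ 1 & 0 & -2 \end{bmatrix} \longrightarrow \begin{bmatrix} 3 & 0 & 1 \\ 0 & 1 & -2 \\ 0 & 0 & -3 \\ 0 & 0 & 0 \end{bmatrix} \]

the matrices are linearly independent.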

Since 


1 0 -1 1 
—2 -l 1 1 
=2 25 =2 l 
=2 2 2. =2 


1 0 -1 1 
at, 0 -1 -1l 3 
0 0 -6 7 
0 0 oF 
the matrices are linearly independent. 
v=- 


Any set of vectors containing the zero vector is linearly 
dependent. 


a. Ad = —2A, 
b. A3 = Aj + Ad 
a ≠ 6
1 1 1 
a. Since | 1 2 1 |= 1, the vectors are linearly 
E3 2 
independent. 
b. C] = 0, c2 = -1,¢3 =3 
1 2 0 
Since | —1 0 3 | = 13, the matrix is invertible so 
2. ode, 2 


Ax = b has a unique solution for every vector b. 


Linearly independent

Linearly dependent 

If x = 0, then cı = 0, and if x = $, then c2 = 0. 

Let x = 0, then c3 = 0. Now letting x = 1 and x = -1,
c1 = c2 = c3 = 0.

If u and v are linearly dependent, then there are scalars
a and b, not both 0, such that au + bv = 0. If a ≠ 0,
then u = -(b/a)v. On the other hand, if there is a
scalar c such that u = cv, then u - cv = 0.


Setting a linear combination of w1, w2, w3 to 0, we have

0 = c1w1 + c2w2 + c3w3
= c1v1 + (c1 + c2 + c3)v2 + (-c2 + c3)v3

if and only if c1 = 0, c1 + c2 + c3 = 0, and
-c2 + c3 = 0 if and only if c1 = c2 = c3 = 0.




39. Consider c}v, + c2V2 + .¢3V3 = 0, which is true if and b. det(A) = —8 
only if c3v3 = —c1Vı — c2Vv2. If c3 # 0, then v3 would c. Yes, since the determinant of A is nonzero. 
be a linear combmation: of vr and: Yz Contradicting:the d. Since the determinant of the coefficient matrix is 


hypothesis that it is not the case. Therefore, c3 = 0. 


2 ; 5 nonzero, the matrix A is invertible, so the linear 
Now since vı and v2 are linearly independent 


system has a unique solution. 
ĉi =c = 0. 11 17 7 


: . . . e x= 7V5 7 7Z=5 pwl 
41. Since A1, A2,..., An are linearly independent, if 
Ax = xjAy H +x, A, =0 oF 5 a h 
cece ies ia cae X1 2 + Xo, —1 + x3 3 = b2 
then x} = x2 =--- =x, = 0. 1 1 —1 b3 
Review Exercises Chapter 2 b. Since det(4) = 19, the linear system has a unique 
b solution equal to x = A~'b. 
1. Since iia ad — bc #0, the column vectors c. Yes 


d. Yes, since the determinant of A is nonzero, A~! 


are linearly independent. If ad — bc = 0, then the i : : i 
exists and the linear system has a unique solution. 


column vectors are linearly dependent. 


a2 01 Chapter Test: Chapter 2 
3. The determinant | 0 a 0 |=a*—a¥O if and 1. T 2. F 
Lani 3. T 4. T 
only if a # +1, and a £0. So the vectors are linearly 5, F 6. F 
independent if and only if a # +1, and a £0. 7. F 8. T 
5. a. Since the vectors are not scalar multiples of each a F 1 i F 
other, S is linearly independent. á 0: 
b. Since 11. T 12. T 
1 ila lol á 13. F 14. T 
O 1jb |—>]|0 1 b 15. F 16. T 
2 Ije 0 0|—2a+b+c 17. T 18. F 
the linear system is inconsistent for —2a + b+ 19. T 20. T 
c #0. If a = 1,b = 1,c = 3, then the system is 21. F 22. F 
Pe ie 23. F 24. F 
inconsistent and v = ; is not a linear 25. T 26. F 
combination of the vectors. 2T- T 28. F 
a 29.: T 30. F 
c. All vectors | b | such that —2a +b +c =0 31. T 32. F 
: . i 33. T 
d. Linearly independent 
e. All vectors in R? Chapter 3 
1 1 2 1 x Section 3.1 
1012 . 1. Since 
7. a. Let A= > 2 0 1 PX=] 2 pad xı x2 xı — x2 
1 1 2 3 w y |B| yo |=] zn 
Z1 22 Z1— 22 
3 
eel and x2 xı x2 —X1 
—2 y» || yı | =] y2-y 
5 Z2 Z1 a =] 


452 


11. 


13. 


15. 
17. 


19. 


21. 


23. 


Answers to Odd-Numbered Exercises 


do not agree for all pairs of vectors, the operation ⊕ is
not commutative, so V is not a vector space.


. The operation ⊕ is not associative, so V is not a vector
space.


. Since 


c+Ho] z ]=[**er4 | 
y. y 


does not equal 


[3 ]-40L5) 


x : 
for all vectors | y | then V is not a vector space. 


. Since the operation ⊕ is not commutative, V is not a
vector space.


The zero vector is given by 0 = | 9 | . Since this 


0 
vector is not in V, then V is not a vector space. 


a. Since V is not closed under vector addition, V is not 
a vector space. 


b. Each of the 10 vector space axioms is satisfied
with vector addition and scalar multiplication
defined in this way.


Yes, V is a vector space. 


No, V is not a vector space. Let A = I and B = -I.
Then A + B is not invertible and hence not in V. 


Yes, V is a vector space. 


a. The additive identity is 0 = | ; i | , and the 


additive inverse of A is A~!. 
b. If c = 0, then cA is not in V. 


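a. The additive identity is 0 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}. Let
u = \begin{bmatrix} 1 + a \\ 2 - a \\ 3 + 2a \end{bmatrix}. Then the additive inverse is
-u = \begin{bmatrix} 1 - a \\ 2 + a \\ 3 - 2a \end{bmatrix}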


25. 
27. 
29. 


31. 


b. Each of the 10 vector space axioms is satisfied. 


1+t 1+ 0t 1 
ec 00 2—t = 2— 0t =]|2 
3+ 2t 3 + 2(0)t 3 


Each of the 10 vector space axioms is satisfied. 

Each of the 10 vector space axioms is satisfied. 

Since (f + g)(0) = f(0) + g(0) = 1 + 1 = 2, then V is 
not closed under addition and hence is not a vector 
space. 


a. The zero vector is given by f(x + 0) =x? and 
Se +) =f t). 


b. Each of the 10 vector space axioms is satisfied. 


Section 3.2 


1. 
3. 


11. 
13. 
15. 
17. 
19. 


21. 


. The set S is not a subspace of R?. If u = 


The set S is a subspace of R?. 


The set S is not a subspace of R°. If u = = | and 


v= | E | then w+ v= | | €S. 


c = 0, then cv = | ; | €S. 


. Since 
x] yı xı + cy1 
x2 | +e] yo | = | x2 +02 
X3 y3 X3 + cy3 


and (x; + cy1) + (x3 + cy3) = —2(c + 1) = 2 if and 


only if c = —2, so S is not a subspace of R°. 
. Since 
s — 2t x — 2y 
s +c x 
t+s y +x 


(s +cx)— 2(t + cy) 
= S+0X 

(t +cy)+ (s +cx) 
is in S, then S is a subspace. 
Yes, S is a subspace. 
No, S is not a subspace. 
Yes, S is a subspace. 
Yes, S is a subspace. 
No, S is not a subspace since x^3 - x^3 = 0, which is not
a polynomial of degree 3.


Yes, S is a subspace. 


23. 
25. 


27. 


29. 


31. 


33. 


35. 


37. 


39. 


41. 


43. 


45. 


No, S is not a subspace. 


Since 
1 -1 -l 1 1 -1 -1 1 
1 -1 2)/-1];—7> ] 0 0 1 
0 1 0 1 0 0 3a 2) 
the vector v is in the span. 
Since 47. 
1 0 1 | —2 1 0 1 | —2 
1 1 -1 1 ut 0 1 -2 3 
0 2 —4| 6 00 o| 0 49. 
-1 1 -3 5 0 0 0 0 


453 


Answers to Odd-Numbered Exercises 


—1 0 
then S = span 1 |,] —5 
2 3 


Therefore, S is a subspace. 


c. Yes, the vectors are linearly independent. 


d. SAR? 
Since A(x + cy) = l +e ARAE: 

ONAN D AE 
only if c = 0, then S is not a subspace. 


Let B1, B2 € S. Since 


| if and 


A(Bı + cB2) = AB, + cAB2 


the vector v is in the span. = BA + c(B2A) 
Since = (Bı + cB2)A 
ci(l +x) + eae? — 2) + c3(3x) = 2x — 6x — 11 then B; + cB, € S and S is a subspace. 
implies cy = —7, c2 = 2, c3 = $, the polynomial is in 
the span. Section 3.3 
1. The set S has only two vectors, while dim(R?) = 3. 
an(S) = ; ee 3. Since the third vector can be written as the sum of the 
sp = 5 C GE, first two, the set S is not linearly independent. 
5. Since the third polynomial is a linear combination of 
5 a b PER the first two, the set S is not linearly independent. 
SHAN = ett aat Sa 7. The set S is a linearly independent set of two vectors 
in R?. 
c 2 3 aS 
span(S) = {ax + bx + c| SE 0} 9. The set S is a linearly independent set of three vectors 
3 in R3. 
a. span(S) = b abeR 11. The set S is a linearly independent set of four vectors 
b—2a in Moy. Since dim (M2x2) = 4, then S is a basis. 
3 
13. The set S is a linearly independent set of three vectors 
b. Yes, S is linearly independent. in R? and so is a basis. 
a. span(S) = R? 15. The set S is linearly dependent and is therefore not a 
b. Yes, S is linearly independent. basis for R4. 
a. span(S) = R? 17. The set S is a linearly independent set of three vectors 
b. No, S is linearly dependent. in Pz so S is a basis. 
c. span(7) = RÌ; T is linearly dependent. 19. A basis for S is B = 1 1 ad 
d. span(H) = R3; H is linearly independent. 
a. span(S) = P2 dmit) =2, l 
b. No, S is linearly dependent. 21. A basis for 5 is 
ce. 2x? + 3x +5 = 2(1) — (x — 3) + 2x? + 2x) B= {| 1 0 | | 0 1 | | 0 0 |} 
d. T is linearly independent; span(T) = P3 0of1 ofp[o0 1 
a-b Since and dim(S) = 3. 
=s =l 0 23. A basis for S is B = Tar y and dim(S) = 2. 
s— 5t J +| -5 25. The set S is already a basis for R? since it is a linearly 
2s + 3t 3 


independent set of three vectors in R?. 


454 Answers to Odd-Numbered Exercises 


27 


29. 


. A basis for the span of S is given by 
2 0 —1 
B= —3 |,| 2 h| -1 . Observe that 
0 2 0 
span(S) = R°. 


A basis for the span of S is given by 


2 0 4 
B= —3 |,|] 2 |,] 0 . Observe that 


0 2 4 
span(S) = R°. 
31. A basis for R? containing S is 
2 1 1 
B= -1 |,| 04, 
3 2 0 
33. A basis for R* containing S is 
1 3 1 0 
—1 1 0 0 
po ° PT py op. 
4 2 0 0 
35. A basis for R? containing S is 
= 1 
B = 1 |; ,| 0 
0 
37. B = {ej | 1 <i<n} 
43. dim(W) = 2 
Section 3.4 
: 2 
1. [v]s = | 1 | 
2 
3. [v] = | -1 
3 
5 
5. [v]z = 2 
—2 
—1 
2 
7. [V]z = 2 
4 
el 1 
9. Was] 7 hoes] 7 
8 2 


= 


17. 


19. 


21. 


23. 


5. 


[Z] 


7] 


[v] 


B= 


732 — 
4B; g 


“1 


3 49:51 
B 2 
1 
o -i o0 
-1 
ls, =p] 1 
0 
n 0 0 1 
g=|10 0 
01 0 
5 
z By 
n = 0B = | 2 
3 
a —a—b+c 
b = a+b 
c jz a+2b—-—c 
01 0 
Uig=]| 1 0 0 
00 1 
f 2 
[vls =U 1g? =| 1 
3 
i 1 
me =| 4 


d. 4 
os 
P -1 -l 0 
25. ali = 2 2 -1 
0 -=l 1 
Š 2 
b. [2u; — 2u2 + ug], = Up; =3 | = 
1 
Section 3.5 
1. a. yi = e”, y = e” 
ex ex 
b. W[y1,y2]() = rer 3ed =e" 
e y(x) = Cie + Coe 
3. a y = e* y =xe ™ 
trpa % ae 
2 yol(x) = 
132 —2e7™ eo ** — We 
e-* > 0 for all x. 


9. 


e y(x) = Cye7>* + Coxe 


. V(x) = e* + 2xe* 
< a. yox) = Cie™ + Cre* 


b. a=1,b =3,c =4 
y(x) = 4 cos (8x) 


Review Exercises Chapter 3 


1. 


3. a. Since S is closed under vector addition and scalar 


k £69 


multiplication, S is a subspace of Mzx2. 
b. Yes, let a = 3,b = —2,c = 0. 


samo M a 


d. The matrix | > i | is not in S. 


space V. 


b. The set W is not a basis for V since it is not linearly 


independent. 


> 0 for all x. 


. a. The set T is a basis since it is a linearly independent 
set of three vectors in the three-dimensional vector 


Answers to Odd-Numbered Exercises 


. Since vı can be written as 


—c2 —C3 
v= (=)u+ (E) 
c1 Cc) 


then 
V = span{v2,V3,... 


Vn} 


455 


a. The set B = {u,v} is a basis for R? since it is 
linearly independent. To see this, consider 


au+bv=0 


Now take the dot product of both sides with first u, 


then v, to show that a = b = 0. 


b. If [wlz = | 5 |: then 


and 


u2 


Chapter Test: Chapter 3 


F 


SS ee ee) 


xv2 — Yq 


Uuiv2 — V12 


yu, — Xu? 


| uiv — viu 


7d ieo e e e r r e e e He TT A 


456 


Answers to Odd-Numbered Exercises 


Chapter 4 


Section 4.1 


1. 


. T is not linear. 


19. 


21. 


23. 


25. 
27. 


29. 


31. 


33. 


T is linear. 


. T is linear.
. Since T(x + y) ≠ T(x) + T(y) for all real numbers x
and y, T is not linear.


. Since T(cu) ≠ cT(u), T is not linear.

. Since T(0) ≠ 0, T is not linear.

. T is linear.

. Since T(cA) = c^2 T(A) ≠ cT(A) for all scalars c, T is
not linear.


a rw) =| 3 |: t= zj 


b. Yes 
c. Yes 


a. T(u) = | i Jit = =| 
ar] 
-1 


T(u)+ T(v) = | a | 


b. No. T(u + v) = | 
c. No, by part (b). 

[1 5 
eE 
x?) = -1 

[ 3 22 
Ul aera 
a. No. The polynomial 2x^2 - 3x + 2 cannot be written
as a linear combination of x^2, -3x, and -x^2 + 3x.


b. Yes. T(3x? — 4x) = $x? + 6x — B 
-1 0 
a a=] 0 i 


b. T(e)) = | T | and T (e2) = | = |; Observe 


4x + 4x? 


that these are the column vectors of A. 
0 


T 0 =| 6 |-foratz er 


Zz 


a. The zero vector is the only vector in R? such that 


= 
S 


35. 


37. 


39. 


41. 


43. 


ia | cTi(v) + Tw) | 


cT7(v) + T2(w) 


T\(v 
=el ao || 
cT(v) + T(w) 
T(kA + C) = (kA + C)B - B(kA + C)
= kAB - kBA + CB - BC
= kT(A) + T(C)
a Tif +9)= | [SOO] 


0 


[IOa f ee a 
e [Iod [sco ar 


cT(f) + T(g) 


b. T(2x? -x +3) = 2 

Since neither v nor w is the zero vector, if either
T(v) = 0 or T(w) = 0, then the conclusion holds. Now
assume that T(v) and T(w) are linearly dependent and
not zero; then there exist scalars a0 and b0, not both 0,
such that a0T(v) + b0T(w) = 0. Since v and w are
linearly independent, then a0v + b0w ≠ 0 and since T
is linear, then T(a0v + b0w) = 0.

Let T(v) = 0 for all v in R^3.


Ti (w) | 
Ta(w) 


Section 4.2 


1. 


3. 


Since T(v) = | ; [vs in N(T). 


Since T (v) = | 2 [vs not in N (T). 


10 


5. Since T(p(x)) = 2x, p(x) is not in N (T). 
7. Since T(p(x)) = —2x, p(x) is not in N (T). 


17. 


. Since T 2 


-1 
=v, vis in R(T). 
1 


. The vector v is not in R(T). 


. The matrix A is in R(T) witha = 1,b = 0,c = —2, 


d=-l. 


. The matrix A is not in R(T). 


el 


2 1 
21. 1 |,| 0 
0 1 
23. eee a 
1 1 2 
25. Oj) 1 |,| -1 
2 0 1 
1 0 
27. 0 j,} 1 
0 0 
29. {1,x,x°} 
—6 
31. a. No, 5 | is not in R(T). 
0 
—2 0 
b. ; 
c. Since dim(N(T)) + dim(R(T)) = dim(R^3) = 3 and
dim(R(T)) = 2, then dim(N(T)) = 1.


33. a. The polynomial 2x? — 4x + 6 is not in R(T). 
b. {-2x + 1x? +x} = {T(x), T(x’)} 
i x 
35. T = 
: | y | 


37. a. The range R(T) is the subspace of P, consisting of 
all polynomials of degree n — 1 or less. 


dim(R(T)) =n 
dim(N(T)) = 1 
dim(R(T)) = 2 
dim(N(T)) = 1 


{fo oflo 1 |} 


43. a. The range of T is the set of symmetric matrices. 


39. 


Spe 


4 


ay 


b. The null space of T is the set of skew-symmetric 
matrices. 


45. If the matrix A is invertible, then R(T) = My xn. 


Section 4.3 
1. T is one-to-one. 
3. T is one-to-one. 
5. T is one-to-one. 
7. T is onto R?. 
9. T is onto RÌ. 


Answers to Odd-Numbered Exercises 457 


11. Is a basis 
13. Is a basis 
15. Is a basis 
17. Is a basis 
19. Is a basis 


21. a. Since det(A) = det (| A o }) =-30, 


2 -3 
then T is an isomorphism. 
-3 0 
=f 
b. A =—3 | > 1 | 


O 
i 
g 
-——— 
a 
m~ 
=~ x 
es) 
Me 


Il 
WIN = 
u= © 
<a) 
p 

N 
x 
= 
U 
= 
——<—<——_ } 


ll 
= 
<O& 
i 


23. a. Since 
—2 0 
det(A) = det 1 -1 -l1 
0 1 0 
=-140 
then T is an isomorphism. 
-1 -1 -l1 
b. A7! = 0 0 1 
-1 -2 -2 
x 
e ATT|| y 
Z 
1 1 1 2x +z 
= 0 0 1 x-y-z 
-1 -2 -2 y 
x 
| y 
Zz 


25. T is an isomorphism. 

27. T is an isomorphism. 

29. Since T(cA + B) = (cA + B)^t = cA^t + B^t = cT(A)
+ T(B), T is linear. Since T(A) = 0 implies that A = 0,
T is one-to-one. If B is a matrix in Mn×n and A = B^t,
then T(A) = T(B^t) = (B^t)^t = B, so T is onto. Hence,
T is an isomorphism.

31. Since T(kB + C) = A(kB + C)A^-1 = kABA^-1 +
ACA^-1 = kT(B) + T(C), T is linear. Since T(B) =
ABA^-1 = 0 implies that B = 0, T is one-to-one. If C is
a matrix in Mn×n and B = A^-1CA, then
T(B) = T(A^-1CA) = A(A^-1CA)A^-1 = C, so T is onto.
Hence, T is an isomorphism.


33. 


a 
T : = ax? + bx? +ex+d 
d 
35. Since 
x 
V= y x,y ER 
x + 2y 
define T: V > R? by 
6 x 
ESEE 
x + 2y Á 


37. Let v be a nonzero vector in R^3. Then a line L through
the origin can be given by


L = {tv | t ∈ R}


Now, let T: R^3 → R^3 be an isomorphism. Since T is
linear, T(tv) = tT(v). Also, by Theorem 8, T(v) is
nonzero. Hence, the set


L' = {tT(v) | t ∈ R}


is also a line in R^3 through the origin. The proof for a
plane is similar with the plane being given by


P = {su + tv | s, t ∈ R}
for two linearly independent vectors u and v in R^3.
Section 4.4 


1. a. [T] = | 5 


11. 


f, 1 1 
.a (TZ =| 0 -1 -2 
0 0 1 


b. T(x? — 3x +3) =x? — 3x + 3; 
[Ta — 3x +3)] p = [THB [x? — 3x + 3]e 


fi 3 
=[ |1 |=| -3 
1 1 
T(x? — 3x + 3) = 3 — 3x +x? 
a b 0 —2b 
waz |ie b laaro], A 


1 1 
13. a. Te =| | 2 
A 
biia 
9| 11 
, 1f5 
BE eo 
e Ti =5] 1 
lf 5 
B _ L 
a (l= 3] i 
spl 1f -2 
SMES S| 5 
, 1f 22 
WB ie act, ost 
£ ITI = 5] < 
0 0 
15. a. [TĘ =| 1 0 
0 5 
0 0 
b. [TŻ =| 0 1 
1 
1 0 
0 1 
e (TIS =| 0 0 
1 
5 0 
0 1 
B c 
a. sg-=| 9 0 


e. [SB [TE = | 


[TŻ SiS 


f. The function S o T is the identity map; that is, 
(S o T)\(ax +b) = ax +b so S reverses the action 


of T. 


17. [T]g = | 5 E | 


The transformation T reflects a vector across the x-axis. 


19. [T] = cI 


M) 


0 


0 
0 
1 


Answers to Odd-Numbered Exercises 


21. [TZ =[1001] 


23. a. [2T + S]g = 2[T]g + [S]; = | = 2 | 


b [55 


25. a. [S © Tle = [S]s[T]s = | > | 


» | o | 


3 3 1 
27. a. [-3T +38 p= | 2 -6 6 
ae | 
3 
b. | —26 
—9 
4 —4 —4 
29. a. [S > T]g = t st <1 
= a S | 
—20 
b. | —5 
5 
0006 0 
0000 24 
31. [T=|0 000 0 
0000 0 
0000 0 
=j 
—4 
[TP@)]zg = 


Oooo N 
EEE 


T(p(x)) = p(x) = —12 — 48x 
0 0 0 
100 
Ber 
33.188 =| 5 1 9 
001 
0 1 0 0 
[Dz =|0 0 2 0 
0 0 3 
10 0 
D S} =| 0 2 0 |=[T]s 
0 0 3 


459 


460 Answers to Odd-Numbered Exercises 


35. If A= | f | then the matrix representation for 
T is 
0 -c b 0 
b a-d 0 b 
[Ts = 0 d-a —c 
0 c —b 0 
1 1 0 0 0 
0 1 1 0 0 0 
00 1 1 0 0 
37. [T]; = : 
000 0 0 1 0 
000 0 0 1 1 
000 0 =O 1 
Section 4.5 


mmom=[_1 3 ][ t]-[3 
[isiin = | k 5 | E | -| 5] 


To show the results are the same, observe that 
[1 -1 2 
aaee 

1 1 2 0 
3. a. ie, = | ae a 
1 1 
[T]; [v], =| oe | | E | 


1 
mewe [e oJ- 
2 


To show the results are the same, observe that 


eo ea 


N vw 
ll 
-— 


1 0 0 
5. a. (Play = | 0 0 0 
0 0 1 
1 -1 0 
[Tln = | 0 0 
0 11 
b 1 0 0 1 1 
[T]z,[vlz, =| 0 0 0 2 |= 0 
0 0 1 —1 —1 


1 —1 0 3 1 
[Tle,[vle, =] 0 0 0 2ļ=] o0 
0 11 —4 -2 


To show that the results are the same, observe that 


1 =i 0 1 
ıl ofl+0) = Sh Pe SO sie | ae o0 
1 0 1 -1 


3 -I 
pes = | al 


[T]s, =P '[T]e,P 


saral a a a 


Nig 


B 
9. P = [r]! 


Wl Whe 
| 

=. = 

L 


v= Pirie 


ll 
Nw 
| 
NIW 
Fä 
O = 
| 
- oO 
—— 
| 
WI Whe 
| 
=. = 
b 


v= © n= 


-| 


B 2. "1 
PSU's | 


Pin = P'[T]s,P 
_f 2 -1 
Fu ES 2. 
_f -1 -2 
7 6 6 
1 -l 
—2 1 
B —]1 -l 
P= (Np, =| 2 1 | 
[T]s, = P'[T]s,P 
f1 1 1 -1 -1 -1 
|) aS E —2 1 2 1 
[1 1 
S T 


| 
p 
oN 
wo 
| 
p=] 
W N 
Noe 
| 


13. [T]z, = | 


Answers to Odd-Numbered Exercises 461 


0 1 0 a [4 0 
15. [T] =| 0 0 2 erg =|} 2 
0 0 0 
ee —J2/2 2/2 
0 2 0 Sa s=| _ a yap 
[T] =|] 90 0 1 
00 0 b. AD 
If a 
1 0 -2 T 
P=—fi=|0 2 0 ii 
= [I]; = al 
0 0 1 
TARE ay Hie 
then [7], = P7'[T],P. Y m 
17. Since A and B are similar, there is an invertible matrix + 
P such that B = P~'AP. Also since B and C are zd 
similar, there is an invertible matrix Q such that 
—o-! — 0-1p-! — 
C =Q BQ. Therefore, C = Q~ P APQ = eve -2/2 -/2/2 
(PQ) A(PQ) so that A and C are also similar. S J2/2 -2/2 
19. For any square matrices A and B the trace function 
satisfies the property tr(4B) = tr(BA). Now, since A 3/2 —1/2 3/2 —1/2 
and B are similar matrices, there exists an invertible 7. a. 1/2 3/2 J3/2+1/2 
matrix P such that B = P~!A4P. Hence, 0 0 1 
tr(B) = tr(PT!AP) = tr(APP') = tr(A) b. Ay 
21. Since A and B are similar matrices, there exists an aj 
invertible matrix P such that B = P~!AP. Hence, + J 
a —1 n —_ p-lyn T 
B" =(P APY =P AP THN Z 4 
Thus, A” and B” are similar. T i 
Section 4.6 1 
r —5 e 
1. a. o S | 
- JV3/2 1/2 -1 
j eo e | —1/2 3/2 -1 
“{ 0 1 0 0 1 
[1 0 | ff 0 1] [o] 
c 9. a. = 
| 0 3 LL OJ], L0] 
3 0 ff 2] [2] 
3. a. [T]s = | 1 | = 
0 =z [L2 ]j L0] 
b. A y pi 0 1] = [ ] 
10+ LL2 dle L1] 
Í 1 1] 
E R S _ 
T b. [T]} = | p =i | 


462 Answers to Odd-Numbered Exercises 


E cise tel ee -1 2 
-1 1| | 0 c 5 —4 
h. [T] = 7 5 
a) —2 4 
5+ 
L -1 2 
Z l=} 3 Ib | 
4 Vie 7 -5 Y Jls 
Ss =9 4 
| —1 2 
t 5 4 | ix +4 | 
—5+ = = 2 1 
ee 
The original triangle is reflected across the line 
y=x. x—y 
Be tee _ | —*+3y 
d. 0 1 0 _ 0 ~ | =x +4y 
| 1 O} | O | | 0 | 2x — 2y 
0 1 2 = 2 x 
| 1 Of} | 2 | | 2 | x A 
- eat fet This implies tat | | = 4 
0 1 oO} _ |} 2 x-y 
|} 1 Of}; | 2] [0] 2y 
x x x =x 
Review Exercises Chapter 4 3. a. S | y | = | -y F | y | = | y | 
1. a. The vectors are not scalar multiples, so S is a basis. 
b. [S]; = 1 0 Tle = -1 0 
x ete jo 1f? | o1 
x x+y 
b. T = —1 0 
H x—y eT °sh=| 0 {|= oth 
2y 
The linear operators S o T and T o S reflect a vector 
e N(T) = {0} through the origin. 
d. Since N(T) = {0}, T is one-to-one. 0 1 
5. a Te =| | 
1 0 1 0 
1 1 
e. 3 B' 1 0 
ile) 2a bing’ =| 5 4 | 
2 2 
f. No, T is not onto since dim(R(7)) = 2 and aa 
. No, T is not onto since dim = 2 an 73.a. [T=] 0 0 1 
a 0 1 0 
dim(R*) = 4. Also f is in R(T) if and only =| -] =j 
d b. |T 2 | = 1 | =T 2 
1 2 1 
ifc +b — 2a = 0. B 
ia Ie 1] fo 0 
0 1 0 1 ce NT)=4 | 0 
g- 1, orlo rlo 0 


1 
e. [T"] =| 0 
0 


9. Since T? — T +I = 0,T — T? = I. Then 
(Te — TO) =T —T)v)) = Tv — T(v)) 
= T(v)—T*(v) =I (v) =v 


re oS 


0 
1 
0 


Chapter Test: Chapter 4 


1. F 


N 
m 
mammana 


Chapter 5 
Section 5.1 
1. \x=3 
3. x =0 


5S =l 


7. a. 7+5r.=0 


b. X1 = 0, x2 = —5 


ev=] 


n 


P Pos 


10. 
12. 
14. 
16. 
18. 
20. 
22. 
24. 
26. 
28. 
30. 
32. 
34. 
36. 
38. 
40. 


Se e e: e: e e ie ee ee | 


9. 


11. 


13. 


15. 


Answers to Odd-Numbered Exercises 


d. -2 2 1] [0] of! 
3 =3 ai A 1 
—2 2 2 
3 -3 3 
a. (A — 1)? =0 
b. 4, =1 
efit tal 
C. vV = 0 
d. 1 -2 i SASA i a ee 
0 1 oj lol” 
a. (A+ 1704-1) = 0 
b. 4) =—1, 02 =1 
1 1 
ev= | 0 l,v=]|2 
0 2 
d. [ -10 1 1 -1 
01 0 0 |= 0 
0 2 -1 0 0 
-1 0 1 1 1 
0 1 0 2/=| 2 
0 2 -1 2 2 
a. (A — 2)(h — 1)? =0 
b. 4) =2, 22 = 1 
1 = 
GQ vV = 0 »v= 1 
0 1 
d. Ie 17-2 1 2 
0 2 -1 oļ=]l0 
01 0 0 0 
2 de 2.2 -3 - 
0 2 -1 1ļ= 1 
01 0 1 1 
a. (A+ IQA — 2). 4+ 2)(2 — 4) = 0 
b. Ay =—1, ko =2, 43 = —2, M4 = 4 


0 

1 
SA c= 0 »V3= 

o} | 


7 mt 
oro°9e 


463 


464 


17. 


19. 


21. 


23. 


25. 


27. 


29. 


Answers to Odd-Numbered Exercises 


-1 0 0 0 1 -1 
02 0 o}|]o]}]_} o0 
o0 -2 oļļlo]7] o 
00 04j[o0 0 
1 
0 
=-1| o 
0 


The other cases are similar. 


Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. The characteristic equation is
$(a - \lambda)(d - \lambda) - bc = 0$, which simplifies to
$\lambda^2 - (a + d)\lambda + (ad - bc) = 0$. Observe that the
coefficient of $\lambda$ is $-(a + d)$, which is equal to $-\mathrm{tr}(A)$.
Also, the constant term $ad - bc$ is equal to $\det(A)$.
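
As a quick numerical illustration of the trace and determinant observation above, here is a minimal Python sketch. The function names and the sample matrix with rows (2, 1) and (1, 2) are assumptions chosen for illustration and do not come from the text.

```python
import math

def char_poly_2x2(a, b, c, d):
    """Coefficients (1, -trace, det) of lambda^2 - (a + d)*lambda + (ad - bc)."""
    return (1, -(a + d), a * d - b * c)

def real_eigenvalues_2x2(a, b, c, d):
    """Real roots of the characteristic equation (raises if the roots are complex)."""
    _, p, q = char_poly_2x2(a, b, c, d)
    disc = p * p - 4 * q
    if disc < 0:
        raise ValueError("complex eigenvalues")
    root = math.sqrt(disc)
    return ((-p - root) / 2, (-p + root) / 2)

# Hypothetical example matrix [[2, 1], [1, 2]]: trace 4, determinant 3.
print(char_poly_2x2(2, 1, 1, 2))
print(real_eigenvalues_2x2(2, 1, 1, 2))
```

For this sample matrix the two roots sum to the trace and multiply to the determinant, as the characteristic equation requires.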


Suppose A is not invertible. Then the homogeneous
equation $Ax = 0$ has a nontrivial solution $x_0$. Observe
that $x_0$ is an eigenvector of A corresponding to the
eigenvalue $\lambda = 0$ since $Ax_0 = 0 = 0x_0$. On the other
hand, suppose that $\lambda = 0$ is an eigenvalue of A. Then
there exists a nonzero vector $x_0$ such that $Ax_0 = 0$, so A
is not invertible.


Let A be such that $A^2 = A$, and let $\lambda$ be an eigenvalue
of A with corresponding eigenvector v so that $Av = \lambda v$.
Then $A^2v = \lambda Av$, so $Av = \lambda^2 v$. The two equations

$Av = \lambda v$ and $Av = \lambda^2 v$

imply that $\lambda^2 v = \lambda v$, so that $(\lambda^2 - \lambda)v = 0$. Since
$v \neq 0$, then $\lambda(\lambda - 1) = 0$, so that either $\lambda = 0$ or $\lambda = 1$.


Let A be such that $A^n = 0$ for some n, and let $\lambda$ be an
eigenvalue of A with corresponding eigenvector v, so
that $Av = \lambda v$. Then $A^2v = \lambda Av = \lambda^2 v$. Continuing in
this way, we see that $A^n v = \lambda^n v$. Since $A^n = 0$, then
$A^n v = 0$. Since $v \neq 0$, then $\lambda^n = 0$, so that $\lambda = 0$.


If A is invertible, then
$\det(AB - \lambda I) = \det(A^{-1}(AB - \lambda I)A)$
$= \det(BA - \lambda I)$
Since
$\det(A - \lambda I) = (a_{11} - \lambda)(a_{22} - \lambda) \cdots (a_{nn} - \lambda)$
the eigenvalues are the diagonal entries.


Let $\lambda$ be an eigenvalue of C with corresponding
eigenvector v. Let $C = B^{-1}AB$. Since $Cv = \lambda v$, then
$B^{-1}ABv = \lambda v$. Then $A(Bv) = \lambda(Bv)$. Therefore, $Bv$ is
an eigenvector of A corresponding to $\lambda$.


31. 


33. 


35. 


Ler | * |= 
y 


and } = —1 with corresponding eigenvectors | 0 | 


| Bi | . The eigenvalues are } = 1 


1 
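and | : | respectively.

If $\theta \neq 0$ and $\theta \neq \pi$, then T can only be described as a
rotation. Hence, $T\begin{bmatrix} x \\ y \end{bmatrix}$ cannot be expressed by scalar
multiplication, as this only performs a contraction or a
dilation. When $\theta = 0$, T is the identity map
$$T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ y \end{bmatrix}.$$
In this case every vector in $\mathbb{R}^2$ is an
eigenvector with corresponding eigenvalue equal to 1.
Also, if $\theta = \pi$, then
$$T\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -1 & 0 \\ 0 & -1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}.$$
In this case every vector in $\mathbb{R}^2$ is an eigenvector with
eigenvalue equal to $-1$.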


1 1 
-3 37 0 
a. [T] =| -4 4 0 
-1 -1 
1 10 
b. [T] =| -1 —1 0 
-1 01 


c. The characteristic polynomial for the matrices in
parts (a) and (b) is given by $p(x) = x^3 - x^2$. Hence,
the eigenvalues are the same.


Section 5.2 
1. $P^{-1}AP = \begin{bmatrix} 1 & 0 \\ 0 & 3 \end{bmatrix}$

3. $P^{-1}AP = \begin{bmatrix} 0 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$

5. Eigenvalues: $-2$, $-1$; A is diagonalizable since there are two distinct eigenvalues.

7. Eigenvalues: $-1$ with multiplicity 2; eigenvectors: | o | ; A is not diagonalizable.

9. Eigenvalues: 1, 0; A is diagonalizable since there are two distinct eigenvalues.

11. Eigenvalues: 3, 4, 0; A is diagonalizable since there are three distinct eigenvalues.


13. 


15. 


17. 


19. 


21. 


23. 


25. 


27. 


Eigenvalues: $-1$ and 2 with multiplicity 2;
eigenvectors: $\begin{bmatrix} 1 \\ -5 \\ 2 \end{bmatrix}$, $\begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$; A is not
diagonalizable since there are only two linearly
independent eigenvectors.


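Eigenvalues: 1 and 0 with multiplicity 2; eigenvectors:
$\begin{bmatrix} -1 \\ 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}$, $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$; A is diagonalizable since
there are three linearly independent eigenvectors.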
Eigenvalues: —1,2,0 with multiplicity 2; 


0 0 0 -1 

eigenvectors: = l z ot; 
O A 0 / 1/7 
0 3 1 0 


A is diagonalizable since there are four linearly 
independent eigenvectors. 


-3 0]. ao | 2 0 
p=| Me ape d 
02 0 
P=|1 1 1 
163.2 
-1 0 0 
P'AP = 0 1 0 
0 0 0 
2 0 0 
P=|1 10 
0 0 1 
-1 0 0 
P-'4P = 0 1 0 
00 1 
-1 0 -1 1 
01 00 
= 00 11 
10 0 0 
100 0 
0 10 0 
-1 = 
BAA ò 0o00 
0 0 0 2 
By induction. If $k = 1$, then $A^k = A = PDP^{-1} = PD^kP^{-1}$.
Suppose the result holds for a natural number k. Then


w 


33. 


35. 


37. 


1. 


$A^{k+1} = (PDP^{-1})^{k+1}$
$= (PDP^{-1})^k(PDP^{-1})$
$= (PD^kP^{-1})(PDP^{-1})$
$= (PD^k)(P^{-1}P)(DP^{-1})$
$= PD^{k+1}P^{-1}$
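
The identity $A^k = PD^kP^{-1}$ can be checked numerically. The following is a minimal sketch in plain Python with exact fractions; the matrix A with rows (2, 1) and (1, 2), its eigenvector matrix P, and all function names are hypothetical choices for illustration, not taken from the exercises.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def power_by_diagonalization(P, d, P_inv, k):
    """Return P D^k P^{-1}, where D = diag(d)."""
    n = len(d)
    Dk = [[d[i] ** k if i == j else 0 for j in range(n)] for i in range(n)]
    return matmul(matmul(P, Dk), P_inv)

def power_by_repeated_multiplication(A, k):
    """Return A^k for k >= 1 by multiplying k - 1 times."""
    result = A
    for _ in range(k - 1):
        result = matmul(result, A)
    return result

# Hypothetical example: A has eigenvalues 3 and 1 with eigenvectors (1, 1) and (1, -1).
A = [[2, 1], [1, 2]]
P = [[1, 1], [1, -1]]
P_inv = [[Fraction(1, 2), Fraction(1, 2)], [Fraction(1, 2), Fraction(-1, 2)]]
d = [3, 1]

print(power_by_diagonalization(P, d, P_inv, 5) == power_by_repeated_multiplication(A, 5))
```

Only the diagonal entries are raised to the k-th power, which is the practical payoff of the induction argument.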
1 0 1 00 0 
.P=|1 -2 2|;D=|01 0|; 
1 1 0 0 0 1 
3 -1 -2 
Ak=pPp'P'=|2 0 -2 
a | 


Since A is diagonalizable, there is an invertible P and
diagonal D such that $A = PDP^{-1}$. Since B is similar to
A, there is an invertible Q such that $B = Q^{-1}AQ$. Then

$D = P^{-1}AP = P^{-1}(QBQ^{-1})P = (Q^{-1}P)^{-1}B(Q^{-1}P)$


If A is diagonalizable with an eigenvalue $\lambda$ of multiplicity
n, then $A = P(\lambda I)P^{-1} = (\lambda I)PP^{-1} = \lambda I$. On the other
hand, if $A = \lambda I$, then A is a diagonal matrix.


01 0 
a. [T] =| 0 0 2 
00 0 
1 12 
b. [T], =| -1 -1 0 
0 0 0 


c. The only eigenvalue of A and B is $\lambda = 0$, of
multiplicity 3.
d. The only eigenvector corresponding to $\lambda = 0$ is
$\begin{bmatrix} -1 \\ 1 \\ 0 \end{bmatrix}$, so T is not diagonalizable.
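If B is the standard basis for $\mathbb{R}^3$, then
$$[T]_B = \begin{bmatrix} 2 & 2 & 2 \\ -1 & 2 & 1 \\ 1 & -1 & 0 \end{bmatrix}$$
The eigenvalues are $\lambda_1 = 1$, multiplicity 2, and $\lambda_2 = 2$
with corresponding eigenvectors $\begin{bmatrix} 0 \\ -1 \\ 1 \end{bmatrix}$ and
$\begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$, respectively. Since there are only two
linearly independent eigenvectors, T is not
diagonalizable.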




39. Since A and B are matrix representations for the same
linear operator, they are similar. Let $A = Q^{-1}BQ$. The
matrix A is diagonalizable if and only if $D = P^{-1}AP$
for some invertible matrix P and diagonal matrix D.
Then

$D = P^{-1}(Q^{-1}BQ)P = (QP)^{-1}B(QP)$

so B is diagonalizable. The proof of the converse is
identical.
Section 5.3
1. yı (t) = [y1 (0) + ya(O)Je* — y2(0)e 
y2(t) = y2(0)e 7” 
1 
3. x(t) = 5V0) — y2(0)Je* 
1 
+ zD + y2(0)]e 7” 
1 
y2(t) = zO + y2(0)]e” 
1 
+ 37100) + y2(0)]e 7” 
5. yi(t) = [2y (0) +20) + y3(0)]e ~ 


+ [=v1(0) — y2(0) — y3(0) Je” 
y2(t) = [-21(0) — 20) — 2y3(0)]e“ 
+ 2[y1(0) + y2(0) + y3(0)]e” 
y(t) = [-2y1(0) — y2(0) — y3(0)]e™ 
+ [2y1(0) + y2(0) + 2y3(0)]e’ 
7. yi(t) =e, yo(t) = —e™ 
9. a. yit) = -51 + BM» 
AO = 1- wy 
yı(0) = 12, y2(0) = 0 
b. yi(t) = 4 + 8e7 0", yo(t) = 8 — Be 70! 


c. $\lim_{t \to \infty} y_1(t) = 4$, $\lim_{t \to \infty} y_2(t) = 8$
The 12 lb of salt will be evenly distributed in a ratio 
of 1:2 between the two tanks. 


Section 5.4 


ra ee | 0.85 0.08 | 


0.15 0.92 


0.7 0.37 
10 x 
nea oe | 


0.35 
© | 0.65 


0.1 02 0.7 


0.5 04 01 
3.T=)} 04 04 0.2 


0 0.36 
T?] 1] =} 0.35 
0 0.29 


0 0.33 
Te) 1 | & ]| 0.33 
0 0.33 


5. $T = \begin{bmatrix} 0.5 & 0 & 0 \\ 0.5 & 0.75 & 0 \\ 0 & 0.25 & 1 \end{bmatrix}$

The steady-state probability vector is $\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}$, and
hence the disease will not be eradicated.
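
The steady-state claim can be illustrated by repeatedly applying the transition matrix to a starting distribution. This is a minimal sketch, assuming the 3 x 3 matrix reads as shown above and that the process starts entirely in the first state; the helper names are hypothetical.

```python
def step(T, x):
    """One step of the Markov chain: the matrix-vector product T x."""
    return [sum(T[i][j] * x[j] for j in range(len(x))) for i in range(len(T))]

def iterate(T, x, n):
    """Apply the transition matrix n times to the state vector x."""
    for _ in range(n):
        x = step(T, x)
    return x

# Transition matrix as read from the answer above (each column sums to 1).
T = [[0.5, 0.0, 0.0],
     [0.5, 0.75, 0.0],
     [0.0, 0.25, 1.0]]

# Assumed starting distribution: everything in the first state.
state = iterate(T, [1.0, 0.0, 0.0], 200)
print(state)
```

Because the third state is absorbing, the probability mass drains out of the first two states and accumulates in the third.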


0.33 0.25 0.17 0.25 
_ | 0.25 0.33 0.25 0.17 


ea T=) 917 025 033 0.25 
0.25 0.17 0.25 0.33 
1 0.5(0.16)" + 0.25 
0 0.25 
b T| o —0.5(0.16)" + 0.25 
0 0.25 
0.25 
«| 025 
"| 0.25 
0.25 


. Eigenvalues of T: 4}; = —q +p4+1,d2 = 1, with 


corresponding eigenvectors | É | and | q/p |: 


1 
The steady-state probability vector is 


l Bae pa | 
l+q/p| |} a 


Review Exercises Chapter 5 


Se eee 


= (a+) | 


b. $\lambda_1 = a + b$, $\lambda_2 = a - b$


en- [i Je [5] 


1 -l1 at+b 0 
ar=|i “ipost a-s] 


. a hy =0; A= 1 


b. No conclusion can be drawn from part (a) about the 
diagonalizability of A. 


0 |= 
C. k= 0 vS 0 = ? 
0 


re oO 
< 
N 

| 
— 


ho = l: v3 = 


oore 


d. The eigenvectors {v,,V2, v3} are linearly 
independent. 

e. A is not diagonalizable as it is a 4 x 4 matrix with 
only three linearly independent eigenvectors. 


ay 1 0 
a de(4-M)=| 0 —-\ 1 
= ae 


=-3.4+k =0 


. a. Let v = _ |. Then 


AV = : SN 
A 1 
so $\lambda$ is an eigenvalue of A corresponding to the
eigenvector v.

b. Yes, since A and $A^t$ have the same eigenvalues.


Chapter Test: Chapter 5 


F 


ee 


Chapter 6 


Section 6.1 


1. 
3. 
5. 


11. 


15. 


17. 




10. 
12. 
14. 
16. 
18. 
20. 
22. 
24, 
26. 
28. 
30. 
32. 
34. 
36. 
38. 
40. 


eA SB 


AXA THA TDA A MHA MH ee BH HH AS 




19. $v_1 \perp v_2$; $v_1 \perp v_4$; $v_1 \perp v_5$; $v_2 \perp v_3$; $v_3 \perp v_4$; $v_3 \perp v_5$


21. Since $v_3 = -v_1$, the vectors $v_1$ and $v_3$ are in opposite
directions.
2 
23. w= 
Ay 
5L 
d u 
=m A >c 
=3 wv 


1 5 
27.w=-| 2 
6 
1 
u 
w 
v 
29. Let u be a vector in $\mathrm{span}\{u_1, u_2, \ldots, u_n\}$. Then there
exist scalars $c_1, c_2, \ldots, c_n$ such that

$u = c_1u_1 + c_2u_2 + \cdots + c_nu_n$

Then

$v \cdot u = v \cdot (c_1u_1 + c_2u_2 + \cdots + c_nu_n)$
$= c_1 v \cdot u_1 + c_2 v \cdot u_2 + \cdots + c_n v \cdot u_n$
$= c_1(0) + c_2(0) + \cdots + c_n(0) = 0$

31. Consider the equation
$c_1v_1 + c_2v_2 + \cdots + c_nv_n = 0$
Taking the dot product of both sides with $v_1$ gives
$v_1 \cdot (c_1v_1 + c_2v_2 + \cdots + c_nv_n) = v_1 \cdot 0$
so
$c_1 v_1 \cdot v_1 + c_2 v_1 \cdot v_2 + \cdots + c_n v_1 \cdot v_n = 0$
Since S is an orthogonal set of vectors, this equation
reduces to
$c_1\|v_1\|^2 = 0$
and since $\|v_1\| \neq 0$, then $c_1 = 0$. In a similar way we
have $c_2 = c_3 = \cdots = c_n = 0$. Hence, S is linearly
independent.

33. Since $\|u\|^2 = u \cdot u$,
$\|u + v\|^2 + \|u - v\|^2 = (u + v) \cdot (u + v) + (u - v) \cdot (u - v)$
$= u \cdot u + 2u \cdot v + v \cdot v + u \cdot u - 2u \cdot v + v \cdot v$
$= 2\|u\|^2 + 2\|v\|^2$

35. If the column vectors of A form an orthogonal set, then
the row vectors of $A^t$ are orthogonal to the column
vectors of A. Consequently,

$(A^tA)_{ij} = 0$ if $i \neq j$

On the other hand, if $i = j$, then $(A^tA)_{ii} = \|A_i\|^2$.
Thus,

$$A^tA = \begin{bmatrix} \|A_1\|^2 & 0 & \cdots & 0 \\ 0 & \|A_2\|^2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & 0 & \|A_n\|^2 \end{bmatrix}$$

37. Suppose that $(Au) \cdot v = u \cdot (Av)$ for all u and v in $\mathbb{R}^n$. By
Exercise 36,
$u \cdot (Av) = (A^tu) \cdot v$

and by hypothesis
$u \cdot (Av) = (Au) \cdot v$

for all u and v in $\mathbb{R}^n$. Thus,

$(A^tu) \cdot v = (Au) \cdot v$

for all u and v in $\mathbb{R}^n$. Let $u = e_i$ and $v = e_j$, so
$(A^t)_{ij} = A_{ij}$. Hence $A^t = A$, so A is symmetric.
For the converse, suppose that $A = A^t$. Then by
Exercise 36,

$u \cdot (Av) = (A^tu) \cdot v = (Au) \cdot v$
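
The symmetry criterion above lends itself to a small check. The sketch below tests $(Au) \cdot v = u \cdot (Av)$ on the standard basis vectors, which is the same choice of u and v the argument uses; the two sample matrices and the function names are assumptions made for illustration.

```python
def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def matvec(A, u):
    """Matrix-vector product, with A stored as a list of rows."""
    return [dot(row, u) for row in A]

def satisfies_identity(A):
    """Check (A e_i) . e_j == e_i . (A e_j) for all standard basis vectors."""
    n = len(A)
    basis = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    return all(dot(matvec(A, u), v) == dot(u, matvec(A, v))
               for u in basis for v in basis)

symmetric = [[1, 2], [2, 3]]      # hypothetical symmetric matrix
not_symmetric = [[1, 2], [0, 3]]  # hypothetical non-symmetric matrix

print(satisfies_identity(symmetric), satisfies_identity(not_symmetric))
```

Checking basis vectors is enough here because both sides of the identity are linear in u and in v.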


Section 6.2 


1. Since $\langle u, u \rangle = 0$ when $u_1 = 3u_2$ or $u_1 = u_2$, V is not an
inner product space.

3. Since $\langle u + v, w \rangle$ and $\langle u, w \rangle + \langle v, w \rangle$ are not equal for all
u, v, and w, V is not an inner product space.

5. Yes, V is an inner product space.
7. Yes, V is an inner product space.
9. Yes, V is an inner product space.
11.
13.
15.
17.
19.
21.
23.


x T 
J sinx dx = cosx dx 
= = 


T 
= cosx sinx dx = 0 


fy Qx —1)de =0 
Io (Cx2+x-}) de =0 


i (—2x3 + 3x? — 4x + 1) dx =0 


a. |x —e* l= 1/5e?- # 
— _2v3 

b cn ae 

a. || 2x2 — 4 |= 2/5 

b. cos§@ = —2 


3 
| 2 5 
a. || A—B |= "|; p [=v 


b. cos = = 


5/6 
8 0 8 
a. |4-B I= |r| 0 3 4 | =V25=5 
8 4 14 


26 
V38V39 


b. cosð = 




s {Ly =o) 


x 
27. y 
z 


2x — 3y +z =0 


29. a. EE = fix? de = l 
b. (e*,e%) = i dx =1 


el =V fia =1 
Ix l= y fi x? de = £ 


EA 
d. cos 8 = 773 


e |1-x l= $ 




31. If f is an even function and g is an odd function, then
fg is an odd function. Then

$\int_{-a}^{a} f(x)g(x)\,dx = 0$

so f and g are orthogonal.

33. $\langle c_1u_1, c_2u_2 \rangle = c_1 \langle u_1, c_2u_2 \rangle$
$= c_1c_2 \langle u_1, u_2 \rangle$
$= 0$


Section 6.3 


3 
1. a. proj, u = | ; 
2 
1 
i | -| | 
.U—projyu= | f 
2 


-1 
v-(u — projyu) = | 1 


——S | 
I 
NI NI 
— 

ll 
i=) 


3 
3. + psi =| : 
5 
8 
b. u= proia =| d 
5 


E 


vle ulo 
ki 
Il 
© 


5. a. proj, u = 


WE WI Wi 




1 
3 
b. u — proj, u = 3 
4 
a3) 
1 
1 3 
v-(u — proj,u) = | —1 3 | =0 
3 
0 
. a. proj, u = 0 
-1 
1 
b. u — proju = | —1 
0 1 
v-(u—proj,u)= | 0 |-]} —1 | =0 
1 
. a. proj, p = 3x — 3 
b. p — proj,p =x? -—3x+ E 
(q.p — proj, P) 
1 
17 
= | Bx—1)(x?--x+—)ad&= 
f (3x I(x ae a) he 
a. proj, p = —4x° +4 


11. 


13. 


15. 


mn 


17. 


19. 


b. p — proj, P = Bx 


2_3 
4 


(q,p — Proj, P) 


ial Hal 
2| a ie a ee 


{v3 - D, 3x — 1, 6V5(x? -x+p} 


R\ eo deca 2 
2H Hea Me ea 
V3} 17 V6] _1 


Ww 


21. 


23. 


25. 


27. 


29. 


31. 


33. 


35. 


37. 


—l1 —2 1 
1 —2 1 1 1 0 
AI or æl- æ- 
1 0 1 
{ v3x, 3x +2} 
1 
1 0 1 1 
a} 1) | -i 
1 1 
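Let
$v = c_1u_1 + c_2u_2 + \cdots + c_nu_n$
Then
$\|v\|^2 = v \cdot v$
$= c_1^2(u_1 \cdot u_1) + c_2^2(u_2 \cdot u_2) + \cdots + c_n^2(u_n \cdot u_n)$
$= c_1^2 + \cdots + c_n^2$
$= |v \cdot u_1|^2 + \cdots + |v \cdot u_n|^2$

Since
$$\sum_{k=1}^{n} a_{ki}a_{kj} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} = (A^tA)_{ij},$$
then $A^tA = I$.

Since $\|Ax\| = \sqrt{Ax \cdot Ax}$ and
$Ax \cdot Ax = x^t(A^tAx) = x \cdot x$
then $\|Ax\|^2 = x \cdot x = \|x\|^2$, so $\|Ax\| = \|x\|$.

By Exercise 32, $Ax \cdot Ay = x \cdot y$. Then $Ax \cdot Ay = 0$ if and
only if $x \cdot y = 0$.

Let
$W = \{v \mid v \cdot u_i = 0 \text{ for all } i = 1, 2, \ldots, m\}$
If c is a real number and x and y are vectors in W, then
$(x + cy) \cdot u_i = x \cdot u_i + c\,y \cdot u_i = 0 + c(0) = 0$
for all $i = 1, 2, \ldots, m$.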


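$$\begin{bmatrix} x & y \end{bmatrix}\begin{bmatrix} 3 & 1 \\ 1 & 3 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = 3x^2 + 2xy + 3y^2 \geq (x + y)^2 \geq 0$$

$x^tA^tAx = (Ax)^tAx = (Ax) \cdot (Ax) = \|Ax\|^2$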


41. Since $Ax = \lambda x$, then $x^tAx = \lambda\|x\|^2$. Since A is positive
definite and x is not the zero vector, then $x^tAx > 0$, so
$\lambda > 0$.

Section 6.4 
1. W- = span 
3. W- = span 
5. WŁ = span 
7. WŁ = span 

tl 

3 

1 

9 -3 

1 

1 

2 

—3 
11. 2 |, 

1 

0 


13. { Px? — x 


15. 


SS ee 


= oO 


| 
mul wn 


O = MRA 


= O NF NIE 


+1} 


17. An orthogonal basis is 


proj, v = 


19. proj; v= 


B= 0 


N 


io) 




21. An orthogonal basis is 


23. 


25. 


a1 [= 
0 21 
PSS E a 
2 6 
-5 
: 21 
Projy v= 75] 3 
6 


1 = 
b. proj, Y = — 
proiv = 75 | 


c = roj ae 3 
.-u=V-p in Y= 75 9 


“aha 


Notice that the vectors $v_1$ and $v_2$ are not orthogonal.
Using the Gram-Schmidt process, orthogonal vectors
with the same span are


1 0 
1 3 
-1 3 
a. Wt = span -1 
1 
1 2 
b. projyv=-= | 5 
Sla 
1 4 
c. u=v-projyy= 3 =2 




2 
d. Since u is a scalar multiple of | —1 j, then u is 


in Wt. 


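27. Let $w \in W_2^\perp$, so $\langle w, u \rangle = 0$ for all $u \in W_2$. Since
$W_1 \subseteq W_2$, then $\langle w, u \rangle = 0$ for all $u \in W_1$. Hence
$w \in W_1^\perp$, so $W_2^\perp \subseteq W_1^\perp$.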


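29. a. Let $A = \begin{bmatrix} d & e \\ f & g \end{bmatrix}$ and $B = \begin{bmatrix} a & b \\ b & c \end{bmatrix}$. Then

$$B^tA = \begin{bmatrix} ad + bf & ae + bg \\ bd + cf & be + cg \end{bmatrix}$$

So $A \in W^\perp$ if and only if $ad + bf + be + cg = 0$
for all real numbers a, b, and c. This implies
$A = \begin{bmatrix} 0 & e \\ -e & 0 \end{bmatrix}$. That is, A is skew-symmetric.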


ab a He ob: 
(cable Slee S 


Section 6.5 


5 
Lates]? 
a=] | 
5 
2 
b. w = AX = 5 
2 
5 
3 
2 
w=b-w = -3 
0 


b. 


b. 
7. a. 


653,089 317,689,173 


Y= 3148 ~ 3287 


y = 0.07162857143x — 137.2780952 
$p_2(x) = 2\sin x - \sin 2x$
$p_3(x) = 2\sin x - \sin 2x + \frac{2}{3}\sin 3x$
$p_4(x) = 2\sin x - \sin 2x + \frac{2}{3}\sin 3x - \frac{1}{2}\sin 4x$
$p_5(x) = 2\sin x - \sin 2x + \frac{2}{3}\sin 3x - \frac{1}{2}\sin 4x + \frac{2}{5}\sin 5x$


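$p_2(x) = \frac{1}{3}\pi^2 - 4\cos x + \cos 2x$
$p_3(x) = \frac{1}{3}\pi^2 - 4\cos x + \cos 2x - \frac{4}{9}\cos 3x$
$p_4(x) = \frac{1}{3}\pi^2 - 4\cos x + \cos 2x - \frac{4}{9}\cos 3x + \frac{1}{4}\cos 4x$
$p_5(x) = \frac{1}{3}\pi^2 - 4\cos x + \cos 2x - \frac{4}{9}\cos 3x + \frac{1}{4}\cos 4x - \frac{4}{25}\cos 5x$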

Section 6.6 
1. $\lambda_1 = 3$, $\lambda_2 = -1$
3. $\lambda_1 = 1$, $\lambda_2 = -3$, $\lambda_3 = 3$


5. X1 = —3 with eigenvector vı = | 


eigenvector V2 = | 1 


1 

7. ^i = 1 with eigenvector v; = | 0 
1 

—1 

eigenvector v2 = 2 

1 

-1 

v= | -l 

1 


V2°V3 = 0. 


9. V3 = span 0 


—1 0 
V_, = span 0 },] 1 
1 0 
3 
1 
11. V3 = span 1 


jä 


=) | na = 2 with 


2 } Observe that v;-v2 = 0. 


; A2 = —3 with 
; 43 = 3 with eigenvector 


. Observe that v}-v2 = v1-v3 = 


13. 


15. 


23. 


25. 


P 
. P 
P 




V_, = span 


dim (V3) + dim (V_3) + dim (V_) 


=14+14+2=4 


Yes. 


ees 


Yes. 
vV2/2 V2/2 07) [ v2 


/ 
—J2/2 J2/2 0 2/2 


—1//2 
1//2 


1/72 
1//2 


} 
| = 
| 


—1/v6 
—2/V/6 
1/6 


1/V3 0 
1/ v3 1/V2 
10 0 
D=|0 2 0 

0 0 -2 
Since $AA^t = BB^t = I$, then

$(AB)(AB)^t = AB(B^tA^t)$


-1/2 
1/2 3/2 


I-| 


or 
re oO 
uN 


$= A(BB^t)A^t = AIA^t$
$= AA^t = I$

Similarly, $(BA)(BA)^t = I$.

Since $AA^t = I$, $A^t$ is the inverse of A, so $A^tA = I$ and
hence $A^t$ is also orthogonal.




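27. a. Since $\cos^2\theta + \sin^2\theta = 1$, then
$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$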
29. If $D = P^tAP$, then
$D^t = (P^tAP)^t = P^tA^tP$

Since D is a diagonal matrix, then $D^t = D$, so
$D = P^tA^tP$ and hence $P^tAP = P^tA^tP$. Then
$P(P^tAP)P^t = P(P^tA^tP)P^t$, so $A = A^t$.

31. a. $v^tv = v_1^2 + \cdots + v_n^2$

b. The transpose of both sides of the equation $Av = \lambda v$
gives $v^tA^t = \lambda v^t$. Since A is skew-symmetric,
$v^t(-A) = \lambda v^t$. Now, right multiplication of both
sides by v gives $v^t(-Av) = \lambda v^tv$, so $v^t(-\lambda v) = \lambda v^tv$.
Then $2\lambda v^tv = 0$, so $2\lambda(v_1^2 + \cdots + v_n^2) = 0$ and
this gives $\lambda = 0$.

Section 6.7 
1. 3002 + V10x’ = 0 
3. 2@¢’P +0’P =1 
YP _ oP _ 
5 5- =! 


7. a. [x nle e ]-1s=0 


b. 10x? — 12xy + 10y? — 16 =0 


9. a. $7x^2 + 6\sqrt{3}xy + 13y^2 - 16 = 0$
b. $7(x - 3)^2 + 6\sqrt{3}(x - 3)(y - 2) + 13(y - 2)^2 - 16 = 0$
Section 6.8 
1. $\sigma_1 = \sqrt{10}$, $\sigma_2 = 0$
3. $\sigma_1 = 2\sqrt{3}$, $\sigma_2 = \sqrt{5}$, $\sigma_3 = 0$
A A: A A 
5. A= V2 V2 8 0 v2 v2 
Peet) | elle eels 0 2 SORER NN 
V2 V2 v2 V2 
0 tL + 
~ | 1 0 0 1 0 0 1 1 
valde -xf2 


9. a. $x_1 = 2$, $x_2 = 0$ b. $x_1 = 1$, $x_2 = 1$
c. $\sigma_1/\sigma_2 \approx 6{,}324{,}555$


Review Exercises Chapter 6 


1 1 2 1 0 0 
l. a 0 0 1 | — |0 1 0 
1 0 0 0 0 1 
0 J2/2 J2/2 
b. 1l, 0 , 0 
0 —/2/2 2/2 
—2 
c. projyv = 0 
-1 
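3. a. If $\begin{bmatrix} x \\ y \\ z \end{bmatrix} \in W$, then
$$\begin{bmatrix} x \\ y \\ z \end{bmatrix} \cdot \begin{bmatrix} a \\ b \\ c \end{bmatrix} = ax + by + cz = 0$$
so $\begin{bmatrix} a \\ b \\ c \end{bmatrix}$ is in $W^\perp$.
b. $W^\perp = \mathrm{span}\left\{\begin{bmatrix} a \\ b \\ c \end{bmatrix}\right\}$
That is, $W^\perp$ is the line in the direction of $\begin{bmatrix} a \\ b \\ c \end{bmatrix}$,
which is perpendicular (the normal vector) to
the plane $ax + by + cz = 0$.
c. $\mathrm{proj}_{W^\perp} v = \dfrac{ax_1 + bx_2 + cx_3}{a^2 + b^2 + c^2}\begin{bmatrix} a \\ b \\ c \end{bmatrix}$
d. $\|\mathrm{proj}_{W^\perp} v\| = \dfrac{|ax_1 + bx_2 + cx_3|}{\sqrt{a^2 + b^2 + c^2}}$
Note: This gives the distance from the point $(x_1, x_2, x_3)$
to the plane.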
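
The projection and distance formulas in parts (c) and (d) translate directly into code. This is a minimal sketch for a plane through the origin; the function names and the sample plane $x + 2y + 2z = 0$ with the point (3, 0, 0) are hypothetical values chosen for illustration.

```python
import math

def project_onto_normal(normal, point):
    """Projection of the point onto the line spanned by the normal vector (a, b, c)."""
    a, b, c = normal
    x1, x2, x3 = point
    scale = (a * x1 + b * x2 + c * x3) / (a * a + b * b + c * c)
    return (scale * a, scale * b, scale * c)

def distance_to_plane(normal, point):
    """Distance from the point to the plane ax + by + cz = 0."""
    a, b, c = normal
    x1, x2, x3 = point
    return abs(a * x1 + b * x2 + c * x3) / math.sqrt(a * a + b * b + c * c)

# Hypothetical example: plane x + 2y + 2z = 0 and the point (3, 0, 0).
print(project_onto_normal((1, 2, 2), (3, 0, 0)))
print(distance_to_plane((1, 2, 2), (3, 0, 0)))
```

The distance is the length of the projection, so the two functions should agree up to rounding.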

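5. a. $\langle 1, \cos x \rangle = \int_{-\pi}^{\pi} \cos x\,dx = 0$
$\langle 1, \sin x \rangle = \int_{-\pi}^{\pi} \sin x\,dx = 0$
$\langle \cos x, \sin x \rangle = \int_{-\pi}^{\pi} \cos x \sin x\,dx = 0$
b. $\left\{ \frac{1}{\sqrt{2\pi}}, \frac{1}{\sqrt{\pi}}\cos x, \frac{1}{\sqrt{\pi}}\sin x \right\}$
c. $\mathrm{proj}_W x^2 = \frac{1}{3}\pi^2 - 4\cos x$
d. $\|\mathrm{proj}_W x^2\| = \frac{1}{3}\sqrt{2\pi^5 + 144\pi}$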




7. Using the properties of an inner product and the fact 7, 


T 8. F 
that the vectors are orthonormal, 
9, F 10. F 
Iv I= vvv 11. T 12. T 
= e (vi vi) te + c2 (Yn, Vn) 13. T 14. T 
15. F 16. T 
=y yit te 17. T 18. T 
If the basis is orthogonal, then 19. F 20. T 
z 21. F 22. T 
Iv I= y cf (vi. v1) +-+ + cå (Vn, Vn) 
23. F 24. F 
e th et te 25. F 26. F 
9. a 1 -l 2 _ 0 1 0 27. T 28. T 
- ee a 29. T 30. F 
1 -l 2 0 0 0 
31. T 32. T 
1 
i E 33. F 34. F 
a) T 6. F 
b B= sl} ib] 2h og = i 
2 37. T 38. T 
1 ; 0 
~?2 39. F 40. T 
5 5 me Appendix A 
1 —1 
e B= 2 2 0 Section A.1 
a iP) apl Z 
2 2 2 1. ANB = {-2,2,9} 
1 1 0 
2 2 
3. AxB = ({(a,b)| a € A,b € B} 
1 1 v2 by 
2 7 7D There are 9 x 9 = 81 ordered pairs in A x B. 
1 1 
tes 0 
ge || 2 a? 5. A\B = {—4,0, 1,3, 5,7} 
cea ee ae 
2 2 2 7. ANB = [0,3] 
1 1 
3 73 0 
9. A\B = (—11,0) 
2 -l1 2 11. A\C = (—11,—9) 
R=]| 0 1 -2 
0 0 v2 13. (4UB)\C = (-11,—9) 
e. 4=OR ig 
ah 
Chapter Test: Chapter 6 
1. T 2. T -H 
3. F 4. F 
5. F 6. T >] 




19. ay 


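21. $(A \cap B) \cap C = \{5\} = A \cap (B \cap C)$
23. $A \cap (B \cup C) = \{1, 2, 5, 7\} = (A \cap B) \cup (A \cap C)$
25. $A \setminus (B \cup C) = \{3, 9, 11\} = (A \setminus B) \cap (A \setminus C)$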


Section A.2 


1. Since for each first coordinate there is a unique second 
coordinate, f is a function. 


3. Since there is no x such that f(x) = 14, the function is 
not onto. The range of f is the set {—2, —1,3,9, 11}. 


5. $f^{-1}(\{-2\}) = \{1, 4\}$
7. Since f is not one-to-one, f does not have an inverse. 
9. {(1, —2), (2, — 1), (3, 3), (4,5), (5, 9), (6, 11)} 
11. f(4 U B) = f((—3,7)) = [0,49) 
f(A) Uf(B) = [0,25] U [0, 49] = [0, 49] 
13. $f(A \cap B) = f(\{0\}) = \{0\}$
$f(A) \cap f(B) = [0, 4] \cap [0, 4] = [0, 4]$
Therefore, $f(A \cap B) \subset f(A) \cap f(B)$, but
$f(A \cap B) \neq f(A) \cap f(B)$.
15. fœ) = = 
17. If n is odd, then f(x) = —x + c. If n is even, then 
fw) =x. 


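19. a. To show that f is one-to-one, we have

$e^{2x_1 - 1} = e^{2x_2 - 1}$
$\Leftrightarrow 2x_1 - 1 = 2x_2 - 1$
$\Leftrightarrow x_1 = x_2$

b. Since the exponential function is always positive,
f is not onto $\mathbb{R}$.

c. Define $g : \mathbb{R} \to (0, \infty)$ by $g(x) = e^{2x - 1}$.

d. $g^{-1}(x) = \frac{1}{2}(1 + \ln x)$.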
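
A round-trip check makes the inverse formula concrete. The sketch below assumes, as reconstructed above, that $g(x) = e^{2x - 1}$ and $g^{-1}(x) = \frac{1}{2}(1 + \ln x)$; the Python names are hypothetical.

```python
import math

def g(x):
    """Assumed function g(x) = e^(2x - 1), mapping the reals onto (0, infinity)."""
    return math.exp(2 * x - 1)

def g_inv(y):
    """Candidate inverse (1 + ln y) / 2, defined for y > 0."""
    return 0.5 * (1 + math.log(y))

# Composing in either order should return the input, up to rounding.
for x in (-2.0, 0.0, 0.3, 1.5):
    print(x, g_inv(g(x)))
```

Restricting the codomain to $(0, \infty)$ is what makes the logarithm in the inverse well defined.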

21. a. To show that f is one-to-one, we have $2n_1 = 2n_2$ if
and only if $n_1 = n_2$.

b. Since every image is an even number, the range of f
is a proper subset of $\mathbb{N}$.

c. $f^{-1}(E) = \mathbb{N}$; $f^{-1}(O) = \emptyset$
23. a. $f(A) = \{2k + 1 \mid k \in \mathbb{Z}\}$
b. $f(B) = \{2k + 1 \mid k \in \mathbb{Z}\}$
c. $f^{-1}(\{0\}) = \{(m, n) \mid n = -2m\}$
d. $f^{-1}(E) = \{(m, n) \mid n \text{ is even}\}$
e. $f^{-1}(O) = \{(m, n) \mid n \text{ is odd}\}$
f. Since $f((1, -2)) = 0 = f((0, 0))$, then f is not
one-to-one.
g. If $z \in \mathbb{Z}$, let $m = 0$ and $n = z$, so that $f(m, n) = z$.


Section A.3 
1. If the side is x, then $h^2 = x^2 + x^2 = 2x^2$, so
$h = \sqrt{2}x$.

3. If the side is x, then the height is $h = \frac{\sqrt{3}}{2}x$, so the area
is $A = \frac{1}{2}x \cdot \frac{\sqrt{3}}{2}x = \frac{\sqrt{3}}{4}x^2$.

5. If a divides b, there is some k such that $ak = b$; and if
b divides c, there is some $\ell$ such that $b\ell = c$. Then
$c = b\ell = (ak)\ell = (k\ell)a$, so a divides c.

7. If n is odd, there is some k such that $n = 2k + 1$. Then
$n^2 = (2k + 1)^2 = 2(2k^2 + 2k) + 1$, so $n^2$ is odd.

9. If $b = a + 1$, then $(a + b)^2 = (2a + 1)^2 =
2(2a^2 + 2a) + 1$, so $(a + b)^2$ is odd.

11. Let $m = 2$ and $n = 3$. Then $m^2 + n^2 = 13$, which is
not divisible by 4.

13. Contrapositive: Suppose n is even, so there is some k
such that $n = 2k$. Then $n^2 = 4k^2$, so $n^2$ is even.


15. Contrapositive: Suppose $p = q$. Then
$\sqrt{pq} = \sqrt{p^2} = p = (p + q)/2$.

17. Contrapositive: Suppose $x > 0$. If $\epsilon = x/2 > 0$, then
$x > \epsilon$.

19. Contradiction: Suppose $\sqrt{2} = p/q$ such that p and q
have no common factors. Then $2q^2 = p^2$, so $p^2$ is even
and hence p is even. This gives that q is also even,
which contradicts the assumption that p and q have no
common factors.

21. If $7xy \leq 3x^2 + 2y^2$, then $3x^2 - 7xy + 2y^2 =
(3x - y)(x - 2y) \geq 0$. There are two cases: either both
factors are greater than or equal to 0, or both are less
than or equal to 0. The first case is not possible since
the assumption is that $x < 2y$. Therefore, $3x \leq y$.

23. Define $f : \mathbb{R} \to \mathbb{R}$ by $f(x) = x^2$. Let $C = [-4, 4]$,
$D = [0, 4]$. Then $f^{-1}(C) = [-2, 2] = f^{-1}(D)$ but
$C \not\subseteq D$.

25. If $x \in f^{-1}(C)$, then $f(x) \in C$. Since $C \subseteq D$, then
$f(x) \in D$. Hence, $x \in f^{-1}(D)$.

27. If $y \in f(A \setminus B)$, there is some x such that $y = f(x)$ with
$x \in A$ and $x \notin B$. So $y \in f(A) \setminus f(B)$, and
$f(A \setminus B) \subseteq f(A) \setminus f(B)$. Now suppose $y \in f(A) \setminus f(B)$. So
there is some $x \in A$ such that $y = f(x)$. Since f is
one-to-one, this is the only preimage for y, so $x \in A \setminus B$.
Therefore, $f(A) \setminus f(B) \subseteq f(A \setminus B)$.

29. By Theorem 3 of Sec. A.2, $f(f^{-1}(C)) \subseteq C$. Let $y \in C$.
Since f is onto, there is some x such that $y = f(x)$. So
$x \in f^{-1}(C)$, and hence $y = f(x) \in f(f^{-1}(C))$.
Therefore, $C \subseteq f(f^{-1}(C))$.


Section A.4 


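1. Base case: $n = 1$: $1^2 = \frac{1(2)(3)}{6}$

Inductive hypothesis: Assume the summation formula
holds for the natural number n.
Consider

$1^2 + 2^2 + 3^2 + \cdots + n^2 + (n + 1)^2$
$= \frac{n(n + 1)(2n + 1)}{6} + (n + 1)^2$
$= \frac{n + 1}{6}(2n^2 + 7n + 6)$
$= \frac{n + 1}{6}(2n + 3)(n + 2)$
$= \frac{(n + 1)(n + 2)(2n + 3)}{6}$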

3. Base case: $n = 1$: $1 = \frac{1(3 - 1)}{2}$
Inductive hypothesis: Assume the summation formula
holds for the natural number n.

Consider

$1 + 4 + 7 + \cdots + (3n - 2) + [3(n + 1) - 2]$
$= \frac{n(3n - 1)}{2} + (3n + 1)$
$= \frac{3n^2 + 5n + 2}{2}$
$= \frac{(n + 1)(3n + 2)}{2}$



5. Base case: $n = 1$: $2 = \frac{1(3 + 1)}{2}$

Inductive hypothesis: Assume the summation formula
holds for the natural number n.
Consider

$2 + 5 + 8 + \cdots + (3n - 1) + [3(n + 1) - 1]$
$= \frac{1}{2}(3n^2 + 7n + 4)$
$= \frac{(n + 1)(3n + 4)}{2}$
$= \frac{(n + 1)(3(n + 1) + 1)}{2}$
7. Base case: $n = 1$: $3 = \frac{3(1)(2)}{2}$

Inductive hypothesis: Assume the summation formula
holds for the natural number n.
Consider

$3 + 6 + 9 + \cdots + 3n + 3(n + 1)$
$= \frac{1}{2}(3n^2 + 9n + 6)$
$= \frac{3}{2}(n^2 + 3n + 2)$
$= \frac{3(n + 1)(n + 2)}{2}$


9. Base case: $n = 1$: $2^1 = 2^2 - 2$
Inductive hypothesis: Assume the summation formula
holds for the natural number n.

Consider
$\sum_{k=1}^{n+1} 2^k = \sum_{k=1}^{n} 2^k + 2^{n+1}$
$= 2^{n+1} - 2 + 2^{n+1}$
$= 2^{n+2} - 2$


11. From the data in the table 


n | 2 + 4 + ... + 2n
1 | 2 = 1(2)
2 | 6 = 2(3)
3 | 12 = 3(4)
4 | 20 = 4(5)
5 | 30 = 5(6)


we make the conjecture that


$2 + 4 + 6 + \cdots + (2n) = n(n + 1)$




13. 


15. 


17. 




Base case: n = 1:2 = 1(2) 

Inductive hypothesis: Assume the summation formula 
holds for the natural number n. 

Consider 


$2 + 4 + 6 + \cdots + 2n + 2(n + 1)$
$= n(n + 1) + 2(n + 1)$
$= (n + 1)(n + 2)$


Base case: $n = 5$: $32 = 2^5 > 25 = 5^2$

Inductive hypothesis: Assume $2^n > n^2$ holds for the
natural number n.

Consider $2^{n+1} = 2(2^n) > 2n^2$. But since
$2n^2 - (n + 1)^2 = n^2 - 2n - 1 = (n - 1)^2 - 2 > 0$ for all
$n \geq 5$, we have $2^{n+1} > (n + 1)^2$.
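
The base case starts at $n = 5$ for a reason that a short search makes visible. This minimal sketch, with hypothetical names, lists the small values of n where $2^n > n^2$ fails and confirms the inequality over a finite range beyond them.

```python
def holds(n):
    """True when 2^n > n^2."""
    return 2 ** n > n ** 2

# Values below the base case where the strict inequality fails.
failures = [n for n in range(1, 5) if not holds(n)]

# Finite spot check from the base case upward; the induction covers all n >= 5.
all_hold = all(holds(n) for n in range(5, 200))
print(failures, all_hold)
```

The failures at small n show why the induction cannot be anchored earlier than 5.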

Base case: n = 1 : 1? + 1 = 2, which is divisible by 2. 
Inductive hypothesis: Assume n? + n is divisible by 2. 
Consider (n + 1)? + (n + 1) =n? +n + 2n + 2. By the 
inductive hypothesis, n? + n is divisible by 2, so since 
both terms on the right are divisible by 2, then 

(n + 1)? + (n + 1) is divisible by 2. Alternatively, 
observe that n? +n = n(n + 1), which is the product of 
consecutive integers and is therefore even. 


Base case: n = 1 : 1 = 1H 
Inductive hypothesis: Assume the formula holds for the 
natural number n. 


Consider 


l+rtr’?+..- 


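19. Base case: $n = 2$: $A \cap (B_1 \cup B_2) = (A \cap B_1) \cup (A \cap B_2)$,
by Theorem 1 of Sec. A.1
Inductive hypothesis: Assume the formula holds for the
natural number n.
Consider
$A \cap (B_1 \cup B_2 \cup \cdots \cup B_n \cup B_{n+1})$
$= A \cap [(B_1 \cup B_2 \cup \cdots \cup B_n) \cup B_{n+1}]$
$= [A \cap (B_1 \cup B_2 \cup \cdots \cup B_n)] \cup (A \cap B_{n+1})$
$= (A \cap B_1) \cup (A \cap B_2) \cup \cdots \cup (A \cap B_n) \cup (A \cap B_{n+1})$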


21. $\binom{n}{r} = \frac{n!}{r!(n - r)!}$
$= \frac{n!}{(n - r)!(n - (n - r))!}$
$= \binom{n}{n - r}$


23. By the binomial theorem,

$2^n = (1 + 1)^n = \sum_{k=0}^{n} \binom{n}{k}$
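
Both binomial facts above, the row sum $2^n$ and the symmetry $\binom{n}{r} = \binom{n}{n - r}$, can be spot-checked with the standard library. This is a minimal sketch with hypothetical function names; `math.comb` requires Python 3.8 or later.

```python
import math

def row_sum(n):
    """Sum of the binomial coefficients C(n, 0) + C(n, 1) + ... + C(n, n)."""
    return sum(math.comb(n, k) for k in range(n + 1))

def is_symmetric(n):
    """Check C(n, r) == C(n, n - r) for every r from 0 to n."""
    return all(math.comb(n, r) == math.comb(n, n - r) for r in range(n + 1))

# Spot check for n = 0, ..., 30.
print(all(row_sum(n) == 2 ** n for n in range(31)))
print(all(is_symmetric(n) for n in range(31)))
```

The row sum check mirrors the substitution of 1 for both terms in the binomial theorem.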


Index 


A 


Abel’s formula, 190 
Addition 
linear transformations and, 209—210 
matrix, 27—29 
of real numbers, 28 
of vectors, 95—99, 129 
Additive identity, 98 
Additive inverse, 97, 98, 136 
Aerodynamic forces, 199 
Algebra, matrix, 26—37. See also 
Matrix algebra 
Angles, between vectors, 
327-330 
Arguments 
contradiction, 426—427 
contrapositive, 426 
direct, 425 
Associated quadratic form, 386 
Associative property, 97, 98 
Augmented matrix 
for consistent linear systems, 22 
explanation of, 15, 16 
facts about, 23 
as solution to linear systems, 16-17 
Axioms, 424 


Back substitution, 4 

Balance law, 310 

Balancing chemical equations, 79 

Basis 
change of, 177—181 
explanation of, 149 
facts about, 171 
method for finding, 166—170 
ordered, 174—176 
orthogonal, 339—340 
orthonormal, 342—352 
standard, 159, 162, 163 
for vector space, 159—164 


Best-fit line, 322 

Bijective functions, 420, 422 
Bijective mapping, 226 
Bilinear inner product, 333 
Binary vectors, 127—128 
Binomial coefficients, 435—436 
Binomial theorem, 437—438 
Brin, Sergey, 276 


C


Cartesian product, of two sets, 411 
Cauchy-Schwartz inequality, 326-327, 
336, 337 
Characteristic equation, 279 
Characteristic polynomials, 279 
Check matrix, 128 
Chemical equation balancing 
application, 1, 79 
Circle, equation of, 385 
Codewords, 127—129 
Coefficient matrix, 15 
Cofactors, of matrices, 56 
Column rank, 221, 222 
Column space, 152 
Column vectors, 27 
Commutative operation, 28 
Commutative property, 97, 129 
Commute, matrices that, 32, 33 
Complement, of sets, 411 
Complementary solution, 193 
Complex numbers 
conjugate of, 377 
equality and, 377 
imaginary part, 144 
real part, 144 
set of, 134 
Components, of vectors, 27, 95 
Composition, of functions, 
420—422 
Computer graphics applications 
explanation of, 255 
projection, 265—268 


reflection, 260, 261 
reversing graphics operations, 261—262 
rotation, 264—265 
scaling and shearing, 256-259 
translation, 262—264 
types of, 199 
Conclusions, 424 
Condition number, 403 
Conic sections 
eigenvalues and, 388-390 
explanation of, 61, 386 
simplifying equations that 
describe, 385 
Conservation of mass law, 1 
Contained sets, 410 
Continuous signals, 93 
Contraction, 257 
Contradiction argument, 426—427 
Contrapositive argument, 426 
Contrapositive statement, 425, 426 
Converse, of theorems, 424—425 
Convex, set on R? as, 272 
Corollaries, 425 
Cosine, of angle between vectors, 
328, 337 
Counterexample, 425 
Cramer’s rule, 62—64 


D 


Damping coefficient, 192 
Data compression, 401-403 
Data sets, least squares approximation 
to find trends in, 371—373 
Demand vectors, 83 
DeMorgan’s laws, 413-414 
Determinants 
facts about, 64—65 
to find equation of conic sections, 61 
linear independence and, 118—119 
method to find, 55-56 
properties of, 57—62 
to solve linear systems, 62—64 
of square matrices, 56 
of 3 x 3 matrix, 54-55 
of triangular matrices, 56-57 
of 2 x 2 matrix, 54, 55 
Diagonalization 
conditions for matrix, 291—292 
eigenvalues and, 293 


eigenvectors and, 282 
examples of, 289-291, 293 
explanation of, 287—289, 377 
facts about, 297—298 
linear operators and, 295-297 
orthogonal, 379—382 
similar matrices and, 293—294 
of symmetric matrices, 377—383 
symmetric matrices and, 293 
systems of linear differential equations 
and, 302-309 
of transition matrices, 312—313 
Diagonal matrix, 56. See also 
Diagonalization 
Differential equations 
applications for, 185, 191—193 
explanation of, 185 
first-order, 186 
fundamental sets of solutions 
and, 188—193 
general solution of, 186 
second-order with constant coefficients, 
186-188, 191-193 
Digraphs, 79-80 
Dilation, 257 
Dimension 
of column space, 221 
explanation of, 164—165 
of vector space, 165-166 
Direct argument, 425 
Directed graphs, 79-80 
Direction field, of systems of differential 
equations, 302 
Direct sum, 360—361 
Discrete signals, 93 
Distance formula, 323 
Domain, 415 
Dot product 
on Euclidean n-space (R”), 323-331, 333 
inner product that is not, 335 
properties of, 326 
of vectors, 29—32, 323 
Dynamical systems, 300 


Echelon form 

definition of, 19 

of matrices, 17—21 
Economic input-output models, 82-84 
Eigenfunctions, 284 


Eigenspaces 
algebraic multiplicity of, 281 
corresponding to eigenvalues, 
280-281 
dimension of, 281—282 
explanation of, 279—280 
geometric multiplicity of, 282 
Eigenvalues 
as complex numbers, 282—283 
diagonalization and, 287—298 
explanation of, 276 
facts about, 284—285 
geometric interpretation of, 278—279 
of linear operators, 283—284 
Markov chains and 
method to find, 310-314 
of square triangular matrices, 283 


systems of linear differential equations 


and, 302—309 
transition matrix and, 312, 313 
of triangular matrices, 283 
for 2x2 matrices, 277—278 
Eigenvectors 
definition of, 276—277 
diagonalization and, 287—298 
explanation of, 276 
facts about, 284—285 
geometric interpretation of, 
278-279 
of linear operators, 283—284 
Markov chains and 
method to find, 310-314 
orthogonal, 379—380 
probability, 313 
of real symmetric matrices, 378 


systems of linear differential equations 


and, 300-309 
transition matrix and, 312, 313 
for 2x2 matrices, 277—278 
Elementary matrix 
definition of, 69 
explanation of, 69-71 
inverse of, 71—72 
Elements, of sets, 409 
Empty sets, 410 
Equal matrix, 27 
Equations, matrix, 48—51 
Equivalent linear systems, 4, 5 
Equivalent triangular systems, 6—7, 
10, 14 




Euclidean n-space (R”) 
definition of, 95 
dot product on, 323-331, 333 
properties of norm in, 329 
vectors in, 94—99, 108, 119, 
323-331 
Euclidean space 
dot product and, 323-331 
geometry of, 7 
Euclidean vector spaces, 130 
Existential quantifiers, 427 
Exponential model, 186 


F 


Factorials, 432 
Finite dimensional vector spaces 
explanation of, 165 
isomorphisms and, 230 
linear operators and, 243—244 
linear transformations between, 
236, 237 
Forward substitution, 68—69 
Fourier polynomials, 373—375 
Free variables, 9 
Frequency, of wave, 93 
Functions 
composition of, 420—422 
explanation of, 415—418 
inverse, 418—421 
one-to-one, 419 
onto, 420 
relations vs., 415, 416 
vector space of real-valued, 
133-134 
Fundamental frequency, 
93-94 
Fundamental sets of solutions 
superposition principle and, 188—189 
theorem of, 190-191 
Wronskian and, 189—191 


G 


Gaussian elimination 

explanation of, 4 

to solve linear systems, 6—11, 

14, 15, 68 

Gauss-Jordan elimination, 19 
General solution, 3 
Goodness of fit, measurement of, 366 
Google, 276 




Gram-Schmidt process 
examples of, 349—352 
explanation of, 344, 347—348, 394 
geometric interpretation of, 

348—349 

Graphics operations in R? 
reflection, 260, 261 
reversing, 261—262 
rotation, 264—265 
scaling and shearing, 256—259 
translation, 262—264 

Graphs 
of conic sections, 61 
of functions, 416 


H 


Hamming, Richard, 127 
Hamming’s code, 127, 129 
Homogeneous coordinates, 262—264 
Homogeneous linear systems, 
49-51, 113 
Horizontal line test, 419 
Horizontal scaling, 256, 257 
Horizontal shear, 258 
Hypothesis 
explanation of, 424 
inductive, 430 


Identity matrix, 39 
Images 
explanation of, 415, 418 
inverse, 416, 418 
Imaginary part, complex 
numbers, 134 
Inconsistent linear systems 
explanation of, 2, 10 
reduced matrix for, 21—22 
Independence, linear. See Linear 
independence 
Inequality, Cauchy-Schwartz, 326-327 
Infinite dimensional vector space, 165 
Initial point, vector, 95 
Initial probability vectors, 275 
Initial-value problems, 
186, 192-193 


Injective mapping. See One-to-one mapping 


Injective functions, 419 
Inner product 
examples of, 334—336 


explanation of, 333 
that is not dot product, 335 
Inner product spaces 
diagonalization of symmetric matrices 
and, 377—383 
explanation of, 333—334 
facts about, 340 
least squares approximation and, 
366-375 
orthogonal complements and, 355-364 
orthogonal sets and, 338—340 
orthonormal bases and, 342—352 
properties of norm in, 336-337 
quadratic forms and, 385-391 
singular value decomposition and, 
392-403 
subspaces of, 355 
Input-output matrix, 83 
Integers, set of, 409 
Internal demand, 83 
Intersection, of sets, 410, 411 
Inverse functions 
explanation of, 418—420 
unique nature of, 421 
Inverse images, 416, 418 
Inverse of elementary matrix, 71—72 
Inverse of square matrix 
definition of, 40 
explanation of, 40—45 
facts about, 45 
Inverse transformations, 230—231 
Invertible functions, 418—420 
Invertible matrix 
elementary matrices and, 72 
explanation of, 41, 54 
inverse of product of, 44—45 
square, 60—61 
Isomorphisms 
definition of, 229 
explanation of, 226 
inverse and, 230—231 
linear transformations as, 
229-231 
one-to-one and onto mappings 
and, 226—230 
vector space, 232—233 


K
Kepler, Johannes, 61
Kirchhoff's laws, 88


L 
Law of conservation of mass, 1 
Leading variables, 10 
Least squares approximation 
background of, 366-368 
to find trends in data sets, 371—373 
Fourier polynomials and, 373-375 
linear regression and, 371—373 
Least squares solutions, 369—371 
Lemmas, 425 
Length, of vectors, 95 
Leontief input-output model, 82 
Linear codes, 129 
Linear combinations 
definition of, 102, 146 
of elements of fundamental 
set, 94 
matrix multiplication and, 107 
of vectors, 102—106, 146 
Linear dependence 
definition of, 111, 157 
explanation of, 111, 157 
of vectors, 112, 158 
Linear equations, in n variables, 3 
Linear independence 
definition of, 111, 157 
determinants and, 118—119 
explanation of, 111—112 
of vectors, 112—117, 158 
Linear operators 
diagonalizable, 295-297 
eigenvalues and eigenvectors of, 283—284 
explanation of, 202, 237 
similarity and, 249—252 
Linear regression, 368, 371-373 
Linear systems 
augmented matrices to solve, 16—17, 22 
consistent, 2, 117 
converted to equivalent triangular 
systems, 6—7, 10, 14 
Cramer’s rule to solve, 62—64 
definition of, 3 
discussion of, 3—4 
elimination method to solve, 4-11, 
14, 15 
equivalent, 4, 5 
explanation of, 2—3 
facts about, 11—12 
with four variables, 2, 8—9 
homogeneous, 49-51, 113 


ill-conditioned, 403 
inconsistent, 2, 10, 21—22 
linear independence and, 117—118 
LU factorization to solve, 75—76 
matrix form of, 48 
nullity of matrices and, 222—223 
in terms of geometric structure of 
Euclidean space, 363—364 
3 × 3, 7-8
triangular form of, 4, 6, 10 
with two variables, 2—3 
vector form of, 106—107 
vector form of solution to, 
48-50 
Linear systems applications 
balancing chemical equations, 79 
economic input-output models, 82-84 
network flow, 79-81 
nutrition, 81—82 
Linear transformations 
computer graphics and, 199, 
255-268 
definition of, 202, 235 
explanation of, 200—202, 235-236 
from geometric perspective, 
203-204 
inverse of, 230 
as isomorphisms, 229—231 
isomorphisms as, 226—233 
matrices and, 202—203, 221—222, 
235-245 
null space and range and, 214—223 
operations with, 209—210 
similarity and, 249—253 
Lower triangular matrix 
examples of, 57 
explanation of, 56, 73 
LU factorization 
facts about, 77 
of matrices, 69, 72—75, 392 
solving linear systems using, 75—76 


M 
Mapping 
bijective, 226 
explanation of, 200—201, 415 
linear transformations and, 201—202, 
205-207, 241—242 
one-to-one, 226—230 
onto, 226, 227 




Markov chains 
applications of, 310—314 
explanation of, 275—276 
Markov process, 310 
Mathematical induction 
base case, 430 
binomial coefficients and binomial 
theorem and, 435—438 
examples of, 431-435 
inductive hypothesis, 430 
introduction to, 429—430 
principle of, 430—431 
Matrices 
addition of, 27—29 
augmented, 15-17, 22, 23 
check, 128 
coefficient, 15 
condition number of, 403 
definition of, 14 
determinants of, 54—65 
diagonal, 56 
discussion of, 14—15 
echelon form of, 17—21 
elementary, 69—72 
finding singular value decomposition 
of, 398—402 
identity, 39 
input-output, 83 
inverse of product of invertible, 44—45 
inverse of square, 39—45 
linear independence of, 114 
linear transformations and, 202—203, 
221-222, 235-245 
LU factorization of, 69, 72—75 
minors and cofactors of, 56 
nullity of, 221—223 
null space of, 152—153 
orthogonal, 381-382 
permutation, 76—77 
positive definite, 354 
positive semidefinite, 354 
rank of, 222 
scalar multiplication, 27 
singular values of, 393—396 
stochastic, 275, 311, 314 
subspaces and, 362 
symmetric, 36 
that commute, 32, 33 
transition, 177—182, 275, 276, 
311-313 


transpose of, 35—36 
triangular, 15, 56—57, 283 
vector spaces of, 130 
Matrix addition, 27—29 
Matrix algebra 
addition and scalar multiplication, 
27-29 
explanation of, 26—27 
facts about, 36—37 
matrix multiplication, 29—35 
symmetric matrix, 36 
transpose of matrix, 35—36 
Matrix equations, 48—51 
Matrix form, of linear systems, 48 
Matrix multiplication 
definition of, 32 
explanation of, 29—35, 210 
linear combinations and, 107 
linear transformations between finite 
dimensional vector spaces and, 
236-237 
properties of, 35 
to write linear systems in terms of 
matrices and vectors, 48—51 
Members, of sets, 409 
Minors, of matrices, 56 
Multiplication. See Matrix multiplication;
Scalar multiplication
Multiplicative identity, 39 
Multivariate calculus, 322 


N
Natural numbers. See also Mathematical induction
set of, 409
statements involving, 429-434
Network flow application, 
79-81 
Newton, Isaac, 61 
Nilpotent, 299 
Noninvertible matrix, 41 
Normal equation, least squares solution 
to, 369-370 
Nullity, of matrices, 221—223 
Null sets, 410 
Null space
of linear transformations, 214—221 
of matrices, 152—153, 221 
Nutrition application, 81—82 


O 


One-parameter family, of solutions, 9
One-to-one functions, 419
One-to-one mapping, 226-230
Onto functions, 420
Onto mapping, 226, 227
Ordered basis, 174-176
Ordinary differential equation, 185
Orthogonal basis
construction of, 347-348
of finite dimensional inner product space, 346-347
singular values and, 394
vectors that form, 355
Orthogonal complement
definition of, 357
examples of, 358-360
explanation of, 355-358
facts about, 364
inner product spaces and, 356
linear systems and, 363-364
matrices and, 362
projection theorem and, 361-362
subspaces and, 358
Orthogonal diagonalization, 379-382
Orthogonal matrix, 381-382
Orthogonal projection
explanation of, 343-345, 360, 362
Gram-Schmidt process and, 347, 348
Orthogonal sets
explanation of, 338
properties of, 338-340
Orthogonal vectors
explanation of, 328-329, 337
in inner product spaces, 338 (See also Inner product spaces)
subspaces of inner product spaces and, 355-360
Orthonormal basis
Gram-Schmidt process and, 347-352
for inner product space, 345-347
ordered, 339-340, 342
orthogonal matrices and, 381
orthogonal projections and, 343-345
Orthonormal vectors, 338




P 
Page, Larry, 276 
PageRank algorithm (Google), 276
Parabolas, general form of, 11 
Parallelogram rule, 96 
Parallel projection, 266 
Parametric equations, 266 
Pascal’s triangle, 435 
Phase plane, 301-302
Period, of wave, 93 
Periodic motion, 93 
Periodic signals, 93 
Permutation matrix, 76—77 
Phase portrait, 302—304, 306 
Photosynthesis application, 1—2 
Pitch, 199 
Pivot, 18, 19 
Pixels, 255 
PLU factorization, 76—77 
Polynomials 
characteristic, 279 
of degree n, 132 
derivative of constant, 217 
Fourier, 373—375 
trigonometric, 373—374 


use of Gram-Schmidt process on space of, 350-351
vector space of, 132-133, 
163, 334 
zero, 132 
Positive definite matrix, 354 
Positive semidefinite matrix, 354 
Predator-prey model, 300 
Preimages, 416 


Principle of mathematical induction. See Mathematical induction
Probability vectors, 311 
Production vectors, 83 
Product matrix, 30—31 
Projection 

example of, 266—268 
explanation of, 265 
orthogonal, 343 
parallel, 266 
Projection theorem, 361—362 
Proofs
contradiction argument, 426-427
contrapositive argument, 426
direct argument, 425
explanation of, 424-425
quantifiers, 427
Propositions, 425


Q 


Quadratic equations 
in three variables, 391 
in two variables, 385—388 
Quadratic forms 
applications for, 385 
associated, 386 
rotation of axes and, 
385-390 
Quadratic surfaces, 391 
Quantifiers, 427 


R
Range
explanation of, 415 
of linear transformations, 
214-221 
Rank of a matrix, 222 
Rational numbers, set of, 410 
Real numbers 
addition of, 28 
set of, 410 
signals representing, 93 
Real part, complex numbers, 134 
Real-valued functions, 133-134
Reduced matrix, for inconsistent linear 
systems, 21—22 
Reduced row echelon form 
explanation of, 18, 19, 23 
transforming matrices to equivalent, 
22-23 
Reflection, 260, 261 
Relations 
explanation of, 415 
functions vs., 415, 416 
Riemann integral, 334 
Roll, 199 
Rotation, 264—265 
Rotation of axes, 385—390 
Row echelon form 
explanation of, 17, 19 
reduced, 18—21 
Row equivalent, 16 
elementary matrices and, 72 
Row operations, 16, 58 


Row rank, of matrices, 222 
Row vectors, 27 


S 
Scalar multiplication 
explanation of, 27—29 
linear transformations and, 209—210 
of vectors, 95—99, 129, 161 
Scalar product 
of matrices, 27 
of vectors, 96 
Scalar projection, 343 
Scaling, 96, 256—258 
Scatterplots, 321 
Second-order differential equations, with 
constant coefficients, 186—188, 
191-193 
Sets
empty, 410 
explanation of, 409—410 
null, 410 
operations on, 410—414 
orthogonal, 338—340 
solution, 3 
Shearing, 258—259 
Signals, 93 
Similar matrix 
background of, 249-251 
explanation of, 252, 253 
Singular value decomposition (SVD) 
data compression and, 
401-403 
explanation of, 392 
four fundamental subspaces and, 401 
method for, 398—400 
theorem of, 396—398 
Singular values, 392 
definition of, 393 
of m × n matrix, 393-396
Solutions, to linear systems with 
n variables, 3 
Solution set, 3 
Span, of set of vectors, 146—152 
Square matrix 
determinant of, 56 
inverse of, 39—45 
invertibility of, 60 
trace of, 142-143 
Standard basis 
explanation of, 162, 163 


matrix representation relative to, 235—237 
polynomials of, 163 
Standard position, of vectors, 95 
State vectors, Markov chains and, 311—312 
Steady-state vectors 
explanation of, 276 
Markov chain and, 313—314 
Stochastic matrix, 275, 311, 314 
Subsets, 410, 412 
Subspaces 
closure criteria for, 144 
definition of, 140 
examples of, 142-143 
explanation of, 140—142 
facts about, 153 
four fundamental, 401 
of inner product spaces, 355—360, 
362 
null space and column space of matrix 
and, 152-153 
span of set of vectors and, 146—152 
trivial, 142 
of vector spaces, 140, 145, 146 
Substitution 
back, 4 
forward, 68—69 
Superposition principle, 188—189 
Surjective functions, 420 
Surjective mapping. See Onto mapping
Symmetric matrix 
diagonalization of, 
377-383 
explanation of, 36 
Syndrome vectors, 128
Systems of linear differential equations 
diagonalization and, 302—309 
explanation of, 300 
to model concentration of salt in 
interconnected tanks, 
307-309 
phase plane and, 301—302 
uncoupled, 300—301 
Systems of linear equations. See 
Linear systems 


T 


Terminal point, vector, 95 
Theorems 
converse of, 424—425 
explanation of, 424 




Tower of Hanoi puzzle, 
429—430 
Trace, of square matrices, 
142-143 
Trajectories, 301—302 
Transformation, 199—200. See also Linear 
transformations 
Transition matrix 
diagonalizing the, 312-313 
example of, 275, 276 
explanation of, 177—180 
inverse of, 181—182 
Markov chains and, 
311-312 
Translation, 262—264 
Transpose, of matrices, 35—36 
Triangular form 
of linear systems, 4, 6—7, 10 
matrices in, 15 
Triangular matrix 
determinant of, 57, 58 
eigenvalues of, 283 
explanation of, 56—57 
Trigonometric polynomials, 373—374 
Trivial solution, to homogeneous systems, 
49, 50 
Trivial subspaces, 142 


U 


Uncoupled systems, 300—301 

Uniform scaling, 257 

Union, of sets, 410 

Unit vectors, 325 

Universal quantifiers, 427 

Universal set, 410 

Upper triangular matrix 
examples of, 57 
explanation of, 56, 68, 74 


V


Vector addition, 95—99, 129 
Vector form 
of linear systems, 106—107 
of solution to linear systems, 48—50 
Vectors
addition and scalar multiplication of, 95-99
algebraic properties of, 97-98
angle between, 327-330
applications for, 94
binary, 127-128
components of, 27, 95
demand, 83
distance between, 324, 336
dot product of, 29-32, 323
equal, 95
in Euclidean n-space (ℝⁿ), 94-99, 108, 119, 323-331
explanation of, 27, 94
length of, 95, 323-325, 336
linear combinations of, 102-106, 146
linear independence of, 112-117
orthogonal, 328-329, 337
orthonormal, 338
probability, 311
production, 83
span of set of, 146-152
standard position of, 95
state, 311-312
steady-state, 276, 313-314
syndrome, 128
unit, 325
zero, 97


Vector space isomorphism, 229, 232—233 
Vector spaces
abstract, 136-137
basis for, 159-164, 166-170, 174-182
of complex numbers, 134-135
definition of, 129-130
dimension of, 165-166
Euclidean, 130
examples of, 130-135
expanding set of vectors in, 169-170
facts about, 137
finite dimensional, 165, 230, 236, 237
infinite dimensional, 165
isomorphic, 229, 232-233
linear transformations and, 204-208, 211, 220
of matrices, 130
of real-valued functions, 133-134
of polynomials, 132-133, 163, 334
subspaces of, 140, 145, 146
Venn diagrams 
example of, 411, 412 
explanation of, 410—411 
Vertical line test, 419 
Vertical scaling, 257 
Vertical shear, 259 


W


Waveform, of signal, 93 
Waves, period of, 93 
Wronskian, 189—191 
Y 

Yaw, 199 


Z 


Zero polynomials, 132 
Zero vectors, 97 


Introduction to Linear Algebra 


with Applications 


Over the last few decades, linear algebra has become more relevant than ever. Applications
have increased not only in quantity but also in diversity, with linear systems being
used to solve problems in chemistry, engineering, economics, nutrition, urban planning,
and more. DeFranza and Gagliardi introduce students to the topic in a clear, engaging,
and easy-to-follow manner. Topics are developed fully before moving on to the next
through a series of natural connections. The result is a solid introduction to linear
algebra for undergraduates’ first course.


Outstanding features include: 


• Early coverage of vector spaces, providing the abstract theory necessary to understand
applications

• Exercises that range from routine to more challenging, extending the concepts and
techniques by asking students to construct complete arguments

• Numerous examples designed to develop intuition and prepare readers to think
conceptually about topics as they are introduced

• Fact summaries to end each chapter that use nontechnical language to recapitulate
details and formulas


Waveland Press, Inc. 
www.waveland.com 


ISBN 13: 978-1-4786-2777-7 
ISBN 10: 1-4786-2777-8 


